From lixi at uchicago.edu Sat Feb 2 14:37:05 2008
From: lixi at uchicago.edu (lixi at uchicago.edu)
Date: Sat, 2 Feb 2008 14:37:05 -0600 (CST)
Subject: [Swift-user] kickstart.xml transfer error
Message-ID: <20080202143705.AVY16994@m4500-03.uchicago.edu>

Hi,

When running a swift program with kickstart file transfer turned on
(kickstart.always.transfer=true), I encounter the following errors:

...
node completed
node completed
Failed to transfer kickstart records from workflowtest-20080202-1406-sf494p62/kickstart/o/BNL_ATLAS_2Exception in getFile
    task:transfer @ vdl-int.k, line: 322
    sys:try @ vdl-int.k, line: 322
    vdl:transferkickstartrec @ vdl-int.k, line: 409
    sys:set @ vdl-int.k, line: 409
    sys:sequential @ vdl-int.k, line: 409
    sys:try @ vdl-int.k, line: 408
    sys:else @ vdl-int.k, line: 407
    sys:if @ vdl-int.k, line: 405
    sys:set @ vdl-int.k, line: 404
    sys:catch @ vdl-int.k, line: 396
    sys:try @ vdl-int.k, line: 354
    task:allocatehost @ vdl-int.k, line: 334
    vdl:execute2 @ execute-default.k, line: 23
    sys:restartonerror @ execute-default.k, line: 21
    sys:sequential @ execute-default.k, line: 19
    sys:try @ execute-default.k, line: 18
    sys:if @ execute-default.k, line: 17
    sys:then @ execute-default.k, line: 16
    sys:if @ execute-default.k, line: 15
    vdl:execute @ workflowtest.kml, line: 31
    worknode @ workflowtest.kml, line: 105
    sys:sequential @ workflowtest.kml, line: 104
    sys:parallelfor @ workflowtest.kml, line: 88
    sys:sequential @ workflowtest.kml, line: 87
    sys:parallel @ workflowtest.kml, line: 77
    vdl:mainp @ workflowtest.kml, line: 76
    mainp @ vdl.k, line: 150
    vdl:mains @ workflowtest.kml, line: 75
    vdl:mains @ workflowtest.kml, line: 75
    rlog:restartlog @ workflowtest.kml, line: 74
    kernel:project @ workflowtest.kml, line: 2
    workflowtest-20080202-1406-sf494p62
Caused by: org.globus.cog.abstraction.impl.file.FileResourceException: Exception in getFile
Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: (error code 1) [Nested exception message: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
500-globus_l_gfs_file_open failed.
500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
500-globus_xio_register_open failed.
500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
500-Unable to open file /usatlas/prodjob/share/lixi/workflowtest-20080202-1406-sf494p62/kickstart/o/node-o2l4duni-kickstart.xml
500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
500-System error in open: No such file or directory
500-globus_xio: A system call failed: No such file or directory
500 End.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
500-globus_l_gfs_file_open failed.
500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
500-globus_xio_register_open failed.
500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
500-Unable to open file /usatlas/prodjob/share/lixi/workflowtest-20080202-1406-sf494p62/kickstart/o/node-o2l4duni-kickstart.xml
500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
500-System error in open: No such file or directory
500-globus_xio: A system call failed: No such file or directory
500 End.]
node completed
node completed
...

But in the end, I still get the kickstart.xml files in the directory where
I issued the swift command.

Could you tell me why I encounter these errors and how to fix them?

Thanks a lot!

Xi

From benc at hawaga.org.uk Sun Feb 3 09:46:29 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sun, 3 Feb 2008 15:46:29 +0000 (GMT)
Subject: [Swift-user] kickstart.xml transfer error
In-Reply-To: <20080202143705.AVY16994@m4500-03.uchicago.edu>
References: <20080202143705.AVY16994@m4500-03.uchicago.edu>

can you tell me where I can see the log file for this run?
--

From benc at hawaga.org.uk Mon Feb 4 12:16:05 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 4 Feb 2008 18:16:05 +0000 (GMT)
Subject: [Swift-user] kickstart.xml transfer error
References: <20080202143705.AVY16994@m4500-03.uchicago.edu>

On Sun, 3 Feb 2008, Ben Clifford wrote:

> can you tell me where I can see the log file for this run?

OK. I looked at your log file. It ends like this:

2008-02-02 14:17:32,797-0600 DEBUG Loader Swift finished with no errors

which means that even though some stuff may have failed during the
execution, those jobs were retried and subsequently worked successfully.

That is good. Some level of apparently random failure is to be expected in
a real grid environment.

--

From lixi at uchicago.edu Mon Feb 4 12:24:09 2008
From: lixi at uchicago.edu (lixi at uchicago.edu)
Date: Mon, 4 Feb 2008 12:24:09 -0600 (CST)
Subject: [Swift-user] kickstart.xml transfer error
Message-ID: <20080204122409.AVZ50052@m4500-03.uchicago.edu>

Thanks!

But it seems that this kind of error is encountered too often when the
kickstart files are transferred back.
Now I encounter the problem right at the beginning of the run, and the run
is now waiting:

[lixi at login 5000nodes]$ swift -tc.file /home/lixi/swift/test/tc.data -sites.file /home/lixi/swift/test/sites-all.xml workflowtest.swift
Swift v0.3-dev r1580 (modified locally)

RunID: 20080204-1214-qaylsj73
node started
Failed to transfer kickstart records from workflowtest-20080204-1214-qaylsj73/kickstart/m/NebraskaException in getFile
    task:transfer @ vdl-int.k, line: 322
    sys:try @ vdl-int.k, line: 322
    vdl:transferkickstartrec @ vdl-int.k, line: 409
    sys:set @ vdl-int.k, line: 409
    sys:sequential @ vdl-int.k, line: 409
    sys:try @ vdl-int.k, line: 408
    sys:else @ vdl-int.k, line: 407
    sys:if @ vdl-int.k, line: 405
    sys:set @ vdl-int.k, line: 404
    sys:catch @ vdl-int.k, line: 396
    sys:try @ vdl-int.k, line: 354
    task:allocatehost @ vdl-int.k, line: 334
    vdl:execute2 @ execute-default.k, line: 23
    sys:restartonerror @ execute-default.k, line: 21
    sys:sequential @ execute-default.k, line: 19
    sys:try @ execute-default.k, line: 18
    sys:if @ execute-default.k, line: 17
    sys:then @ execute-default.k, line: 16
    sys:if @ execute-default.k, line: 15
    vdl:execute @ workflowtest.kml, line: 31
    worknode @ workflowtest.kml, line: 79
    sys:sequential @ workflowtest.kml, line: 78
    sys:parallel @ workflowtest.kml, line: 77
    vdl:mainp @ workflowtest.kml, line: 76
    mainp @ vdl.k, line: 150
    vdl:mains @ workflowtest.kml, line: 75
    vdl:mains @ workflowtest.kml, line: 75
    rlog:restartlog @ workflowtest.kml, line: 74
    kernel:project @ workflowtest.kml, line: 2
    workflowtest-20080204-1214-qaylsj73
Caused by: org.globus.cog.abstraction.impl.file.FileResourceException: Exception in getFile
Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: (error code 1) [Nested exception message: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
500-globus_l_gfs_file_open failed.
500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
500-globus_xio_register_open failed.
500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
500-Unable to open file /opt/osg/data/lixi/workflowtest-20080204-1214-qaylsj73/kickstart/m/node-m40siyni-kickstart.xml
500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
500-System error in open: No such file or directory
500-globus_xio: A system call failed: No such file or directory
500 End.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
500-globus_l_gfs_file_open failed.
500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
500-globus_xio_register_open failed.
500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
500-Unable to open file /opt/osg/data/lixi/workflowtest-20080204-1214-qaylsj73/kickstart/m/node-m40siyni-kickstart.xml
500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
500-System error in open: No such file or directory
500-globus_xio: A system call failed: No such file or directory
500 End.]

Thanks. Maybe I should stop it and restart the run?

Xi

From benc at hawaga.org.uk Mon Feb 4 12:31:18 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 4 Feb 2008 18:31:18 +0000 (GMT)
Subject: [Swift-user] kickstart.xml transfer error
In-Reply-To: <20080204122409.AVZ50052@m4500-03.uchicago.edu>
References: <20080204122409.AVZ50052@m4500-03.uchicago.edu>

On Mon, 4 Feb 2008, lixi at uchicago.edu wrote:

> But it seems that this kind of error is encountered too often when the
> kickstart files are transferred back.

It looks like you are running on several sites. It might be that one of
the sites isn't working or is misconfigured.
If you use the scripts from the log-processing directory to generate a web
page of summary information, I think there is a table in there listing
which sites were used, and which sites had successful execution.

Try generating that for the logfile that you sent me before and see what
information is there.

--

From benc at hawaga.org.uk Tue Feb 5 13:42:59 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 5 Feb 2008 19:42:59 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <108193.19326.qm@web52310.mail.re2.yahoo.com>
References: <108193.19326.qm@web52310.mail.re2.yahoo.com>

On Mon, 4 Feb 2008, Mike Kubal wrote:

> I installed r1609 and attempted to run the swift job twice. I used the
> commandline at the bottom of your email. The log files you should look
> at are (the others are old jobs):
>
> run_MD_pipeline_loop_for_impdh-20080204-2151-i07ddx30.log
> run_MD_pipeline_loop_for_impdh-20080204-2157-f8lg7itf.log

ok. I see that there were execution errors that caused those two runs to
fail. did you figure those out?

--

From benc at hawaga.org.uk Tue Feb 5 13:48:52 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 5 Feb 2008 19:48:52 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
References: <108193.19326.qm@web52310.mail.re2.yahoo.com>

On Tue, 5 Feb 2008, Ben Clifford wrote:

> ok. I see that there were execution errors that caused those two runs to
> fail.

... which look like:

#0 The executable could not be started.

I don't find that #0 bit on the front familiar. Perhaps it's a GRAM4 or
cog/gt4 provider message...
--

From hategan at mcs.anl.gov Tue Feb 5 13:51:41 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 13:51:41 -0600
Subject: [Swift-user] Re: r1609 run
References: <108193.19326.qm@web52310.mail.re2.yahoo.com>
Message-ID: <1202241102.3197.0.camel@blabla.mcs.anl.gov>

On Tue, 2008-02-05 at 19:48 +0000, Ben Clifford wrote:
> ... which look like: #0 The executable could not be started.
>
> I don't find that #0 bit on the front familiar. Perhaps it's a GRAM4 or
> cog/gt4 provider message...

It's a WS-GRAM error message.

From hategan at mcs.anl.gov Tue Feb 5 14:55:37 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 14:55:37 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <4582.51088.qm@web52308.mail.re2.yahoo.com>
References: <4582.51088.qm@web52308.mail.re2.yahoo.com>
Message-ID: <1202244937.7568.0.camel@blabla.mcs.anl.gov>

Can you try with 1 for a start?

On Tue, 2008-02-05 at 12:53 -0800, Mike Kubal wrote:
> I had some improperly named data files that caused the
> error. I'm about to relaunch with r1609. Should I
> ratchet down the throttle.score.job.factor to 1 or
> leave it at 4?
From hategan at mcs.anl.gov Tue Feb 5 15:38:51 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 15:38:51 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <912081.86479.qm@web52312.mail.re2.yahoo.com>
References: <912081.86479.qm@web52312.mail.re2.yahoo.com>
Message-ID: <1202247531.10168.1.camel@blabla.mcs.anl.gov>

Could it be one of those 64 bit executables on a 32 bit machine (or the
other way around) issues?

On Tue, 2008-02-05 at 13:36 -0800, Mike Kubal wrote:
> I corrected the filenames so the arguments are
> correct, but it is still failing with the message,
>
> #0 The executable could not be started.
>
> The swift script is the same.
>
> The sites file was modified to:
>
> gridlaunch="/home/wilde/vds/mystart">
> storage="/home/kubal/Swift_Runs" major="2" minor="2" />
> url="tg-grid.uc.teragrid.org" />
> /home/kubal/Swift_Runs
> key="project">TG-MCA01S018
>
> Any thoughts?
>
> I rsynced the log file to Ben's dir at UC
>
> log_file:
> run_MD_pipeline_loop_for_impdh-20080205-1522-0x6toxd1.log
>
> Thanks,
>
> Mike
From benc at hawaga.org.uk Tue Feb 5 17:23:14 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 5 Feb 2008 23:23:14 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202241102.3197.0.camel@blabla.mcs.anl.gov>
References: <108193.19326.qm@web52310.mail.re2.yahoo.com> <1202241102.3197.0.camel@blabla.mcs.anl.gov>

can you send the tc.data entry that you think is being used for this
executable?

--

From mikekubal at yahoo.com Tue Feb 5 14:53:24 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Tue, 5 Feb 2008 12:53:24 -0800 (PST)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202241102.3197.0.camel@blabla.mcs.anl.gov>
Message-ID: <4582.51088.qm@web52308.mail.re2.yahoo.com>

I had some improperly named data files that caused the error. I'm about to
relaunch with r1609. Should I ratchet down the throttle.score.job.factor
to 1 or leave it at 4?

--- Mihael Hategan wrote:

> On Tue, 2008-02-05 at 19:48 +0000, Ben Clifford wrote:
> > ... which look like: #0 The executable could not be started.
> >
> > I don't find that #0 bit on the front familiar. Perhaps it's a GRAM4
> > or cog/gt4 provider message...
>
> It's a WS-GRAM error message.
From mikekubal at yahoo.com Tue Feb 5 15:36:21 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Tue, 5 Feb 2008 13:36:21 -0800 (PST)
Subject: [Swift-user] Re: r1609 run
Message-ID: <912081.86479.qm@web52312.mail.re2.yahoo.com>

I corrected the filenames so the arguments are correct, but it is still
failing with the message,

#0 The executable could not be started.

The swift script is the same.

The sites file was modified to:

gridlaunch="/home/wilde/vds/mystart">
url="gsiftp://tg-gridftp.uc.teragrid.org"
storage="/home/kubal/Swift_Runs" major="2" minor="2" />
url="tg-grid.uc.teragrid.org" />
/home/kubal/Swift_Runs
key="project">TG-MCA01S018

Any thoughts?

I rsynced the log file to Ben's dir at UC

log_file: run_MD_pipeline_loop_for_impdh-20080205-1522-0x6toxd1.log

Thanks,

Mike

--- Ben Clifford wrote:

> ... which look like: #0 The executable could not be started.
>
> I don't find that #0 bit on the front familiar. Perhaps it's a GRAM4 or
> cog/gt4 provider message...

From mikekubal at yahoo.com Tue Feb 5 15:52:56 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Tue, 5 Feb 2008 13:52:56 -0800 (PST)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202247531.10168.1.camel@blabla.mcs.anl.gov>
Message-ID: <589178.98688.qm@web52312.mail.re2.yahoo.com>

It sounds like it, but I have GLOBUS::host_types=ia64-compute set in my
tc.file for this application.

--- Mihael Hategan wrote:

> Could it be one of those 64 bit executables on a 32 bit machine (or the
> other way around) issues?
From mikekubal at yahoo.com Tue Feb 5 20:40:43 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Tue, 5 Feb 2008 18:40:43 -0800 (PST)
Subject: [Swift-user] Re: r1609 run
Message-ID: <530305.5469.qm@web52302.mail.re2.yahoo.com>

UC-64 amberize_ligand /home/kubal/MD_tools/amberize_ligand INSTALLED INTEL64::LINUX ENV::ACHOME=/home/kubal/dock6/bin/antechamber;ENV::DOCK_HOME="/home/kubal/dock6";ENV::PATH="/home/kubal/dock6/bin:/usr/bin:/bin";GLOBUS::host_types=ia64-compute

--- Ben Clifford wrote:

> can you send the tc.data entry that you think is being used for this
> executable?

From hategan at mcs.anl.gov Tue Feb 5 21:51:33 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 21:51:33 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <530305.5469.qm@web52302.mail.re2.yahoo.com>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com>
Message-ID: <1202269893.21584.9.camel@blabla.mcs.anl.gov>

Actually, now that I think of it, swift tries to run /bin/bash, which runs
wrapper.sh, which in turn runs the actual executable. So it's a different
problem.

From hategan at mcs.anl.gov Tue Feb 5 22:08:51 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 22:08:51 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202269893.21584.9.camel@blabla.mcs.anl.gov>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com> <1202269893.21584.9.camel@blabla.mcs.anl.gov>
Message-ID: <1202270931.24933.1.camel@blabla.mcs.anl.gov>

Ok, so it looks like through the WS-GRAM route the project attribute gets
lost somewhere.

So until that gets fixed, you can probably try logging in to the head node
and setting a default project.

Mihael

From hategan at mcs.anl.gov Tue Feb 5 22:12:09 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 05 Feb 2008 22:12:09 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202270931.24933.1.camel@blabla.mcs.anl.gov>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com> <1202269893.21584.9.camel@blabla.mcs.anl.gov> <1202270931.24933.1.camel@blabla.mcs.anl.gov>
Message-ID: <1202271129.25918.0.camel@blabla.mcs.anl.gov>

Actually I take that back. However, it does look like a project problem.
Are you sure you're using the right one?

From benc at hawaga.org.uk Wed Feb 6 06:42:20 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 6 Feb 2008 12:42:20 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <530305.5469.qm@web52302.mail.re2.yahoo.com>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com>

did this run through gram2 ok (modulo the killing of the remote site) ?
--

From benc at hawaga.org.uk Wed Feb 6 06:46:35 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 6 Feb 2008 12:46:35 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202270931.24933.1.camel@blabla.mcs.anl.gov>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com> <1202269893.21584.9.camel@blabla.mcs.anl.gov> <1202270931.24933.1.camel@blabla.mcs.anl.gov>

On Tue, 5 Feb 2008, Mihael Hategan wrote:

> Ok, so it looks like through the WS-GRAM route the project attribute
> gets lost somewhere.
>
> So until that gets fixed, you can probably try logging in to the head
> node and setting a default project.

I have no default project on tg-uc, and my project id gets passed through
ok (from sites.xml) through gram4 on a test run.

--

From mikekubal at yahoo.com Wed Feb 6 08:39:23 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Wed, 6 Feb 2008 06:39:23 -0800 (PST)
Subject: [Swift-user] Re: r1609 run
Message-ID: <128036.86585.qm@web52308.mail.re2.yahoo.com>

Yes, this swift script and the remote app worked with gram2.

--- Ben Clifford wrote:

> did this run through gram2 ok (modulo the killing of
> the remote site) ?

From benc at hawaga.org.uk Wed Feb 6 09:10:06 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 6 Feb 2008 15:10:06 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <128036.86585.qm@web52308.mail.re2.yahoo.com>
References: <128036.86585.qm@web52308.mail.re2.yahoo.com>

In the examples directory, there is a 'first.swift' hello world workflow.
You need to add a tc.data entry for 'echo' that points to /bin/echo.

Please try running that with your gram4 configuration.

--

From hategan at mcs.anl.gov Wed Feb 6 09:18:21 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 06 Feb 2008 09:18:21 -0600
Subject: [Swift-user] Re: r1609 run
References: <530305.5469.qm@web52302.mail.re2.yahoo.com> <1202269893.21584.9.camel@blabla.mcs.anl.gov> <1202270931.24933.1.camel@blabla.mcs.anl.gov>
Message-ID: <1202311101.26915.3.camel@blabla.mcs.anl.gov>

On Wed, 2008-02-06 at 12:46 +0000, Ben Clifford wrote:
> I have no default project on tg-uc, and my project id gets passed through
> ok (from sites.xml) through gram4 on a test run.

Yes. I noticed. I was mistaken.

For some reason the ws-gram provider doesn't properly propagate the
literal errors from the service. Adding the following to
etc/log4j.properties may shed some light on the issue:

log4j.logger.org.globus.exec=DEBUG

From hategan at mcs.anl.gov Wed Feb 6 15:46:24 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 06 Feb 2008 15:46:24 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202311101.26915.3.camel@blabla.mcs.anl.gov>
References: <530305.5469.qm@web52302.mail.re2.yahoo.com> <1202269893.21584.9.camel@blabla.mcs.anl.gov> <1202270931.24933.1.camel@blabla.mcs.anl.gov> <1202311101.26915.3.camel@blabla.mcs.anl.gov>
Message-ID: <1202334384.13010.2.camel@blabla.mcs.anl.gov>

I updated the ws-gram provider to presumably provide better error
messages. However, I don't quite seem to be able to understand the ws-gram
fault types. They seem to look like exceptions except they're nothing like
exceptions. So let me know how this works in practice.
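
The logging change suggested earlier in this thread (adding
log4j.logger.org.globus.exec=DEBUG to etc/log4j.properties) could be
scripted roughly as follows. This is only a sketch: the helper name
enable_globus_exec_debug and the example path are hypothetical, and the
only thing taken from the thread is the logger line itself.

```python
from pathlib import Path

# The logger setting quoted in the thread; everything else here is illustrative.
LOGGER_LINE = "log4j.logger.org.globus.exec=DEBUG"


def enable_globus_exec_debug(properties_path):
    """Append the DEBUG logger line unless an identical line is already present.

    Returns True if the file was changed, False if the line was already there.
    """
    path = Path(properties_path)
    text = path.read_text() if path.exists() else ""
    existing = [line.strip() for line in text.splitlines()]
    if LOGGER_LINE in existing:
        return False
    # Make sure the new setting starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + LOGGER_LINE + "\n")
    return True
```

Calling enable_globus_exec_debug("etc/log4j.properties") from the top of a
Swift installation (an assumed location) would add the line once; running
it again leaves the file untouched.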

From benc at hawaga.org.uk Mon Feb 11 10:38:09 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 11 Feb 2008 16:38:09 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <912081.86479.qm@web52312.mail.re2.yahoo.com>
References: <912081.86479.qm@web52312.mail.re2.yahoo.com>

On Tue, 5 Feb 2008, Mike Kubal wrote:

> I corrected the filenames so the arguments are
> correct, but it is still failing with the message,
>
> #0 The executable could not be started.

OK, I get this message now in some tests I've been working on, with my
development swift codebase and cog r1864.

It happens when I submit to PBS on TG-UC, but not on jobmanager Fork (with
everything else the same), both through GRAM4.
--

From hategan at mcs.anl.gov Mon Feb 11 10:43:24 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 11 Feb 2008 10:43:24 -0600
Subject: [Swift-user] Re: r1609 run
In-Reply-To:
References: <912081.86479.qm@web52312.mail.re2.yahoo.com>
Message-ID: <1202748204.16740.0.camel@blabla.mcs.anl.gov>

On Mon, 2008-02-11 at 16:38 +0000, Ben Clifford wrote:
>
> On Tue, 5 Feb 2008, Mike Kubal wrote:
>
> > I corrected the filenames so the arguments are
> > correct, but it is still failing with the message,
> >
> > #0 The executable could not be started.
>
> OK, I get this message now in some tests I've been working on,

Yes, but there should be more to the message.

> with my
> development swift codebase and cog r1864.
>
> It happens when I submit to PBS on TG-UC, but not on jobmanager Fork (with
> everything else the same), both through GRAM4.
>

From benc at hawaga.org.uk Mon Feb 11 11:12:57 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 11 Feb 2008 17:12:57 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To:
References: <912081.86479.qm@web52312.mail.re2.yahoo.com>
Message-ID:

On Mon, 11 Feb 2008, Ben Clifford wrote:

> It happens when I submit to PBS on TG-UC, but not on jobmanager Fork (with
> everything else the same), both through GRAM4.

ok, with the extra error reporting that mihael added, I see that it's
because I didn't specify a project profile key. oops. it works for me now
that I've added that.

--

From benc at hawaga.org.uk Mon Feb 11 10:49:40 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 11 Feb 2008 16:49:40 +0000 (GMT)
Subject: [Swift-user] Re: r1609 run
In-Reply-To: <1202748204.16740.0.camel@blabla.mcs.anl.gov>
References: <912081.86479.qm@web52312.mail.re2.yahoo.com> <1202748204.16740.0.camel@blabla.mcs.anl.gov>
Message-ID:

On Mon, 11 Feb 2008, Mihael Hategan wrote:

> > development swift codebase and cog r1864.
> >
> > It happens when I submit to PBS on TG-UC, but not on jobmanager Fork (with
> > everything else the same), both through GRAM4.

this is with cog r1864, not the head which is r1874 - I have other
problems with post-r1864 cog.

I'll apply r1874 and r1873 on top of r1864 and see what comes out.

--

From zhangzhao0718 at gmail.com Tue Feb 12 16:57:39 2008
From: zhangzhao0718 at gmail.com (Zhao Zhang)
Date: Tue, 12 Feb 2008 16:57:39 -0600
Subject: [Swift-user] Re: build swift
In-Reply-To: <47B21653.9000200@cs.uchicago.edu>
References: <47B2148F.6080308@gmail.com> <47B21653.9000200@cs.uchicago.edu>
Message-ID: <47B22463.2030105@gmail.com>

Hi, All

Here is the error log file from when I tried to build swift. Could anyone
help with this?

To Mihael

This is the latest log file.

zhao

Ioan Raicu wrote:
> Zhao,
> You might want to send emails with these kinds of questions to the
> swift-user mailing, as there are others (i.e. Ben) that could answer.
>
> Ioan
>
> Zhao Zhang wrote:
>> Hi, Mihael
>>
>> could you help me on this?
>>
>> zhao
>>
>> Zhao Zhang wrote:
>>> Hi, Ioan
>>>
>>> When I run falkon-build-swift.sh, I got this problem. See the
>>> attachment for log file.
>>>
>>> zhao
>>>
>>> Ioan Raicu wrote:
>>>> cd .../falkon/
>>>> source falkon.env
>>>> falkon-checkout-cog.sh falkon-checkout-swift.sh
>>>> falkon-checkout-provider.sh falkon-build-swift.s
>>>> falkon-build-provider.sh
>>>>
>>>> viper:/home/iraicu/yongzh/sites-uc-64.xml
>>>> viper:/home/iraicu/yongzh/tc-uc.data
>>>>
>>>>
>>>>
>>>>
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: err.txt
URL:

From hategan at mcs.anl.gov Tue Feb 12 18:02:46 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 12 Feb 2008 18:02:46 -0600
Subject: [Swift-user] Re: build swift
In-Reply-To: <47B22463.2030105@gmail.com>
References: <47B2148F.6080308@gmail.com> <47B21653.9000200@cs.uchicago.edu> <47B22463.2030105@gmail.com>
Message-ID: <1202860966.29685.1.camel@blabla.mcs.anl.gov>

Sorry. Try now.

On Tue, 2008-02-12 at 16:57 -0600, Zhao Zhang wrote:
> Hi, All
>
> Here is the error log file from when I tried to build swift. Could anyone
> help with this?
>
> To Mihael
>
> This is the latest log file.
>
> zhao
>
> Ioan Raicu wrote:
> > Zhao,
> > You might want to send emails with these kinds of questions to the
> > swift-user mailing, as there are others (i.e. Ben) that could answer.
> >
> > Ioan
> >
> > Zhao Zhang wrote:
> >> Hi, Mihael
> >>
> >> could you help me on this?
> >>
> >> zhao
> >>
> >> Zhao Zhang wrote:
> >>> Hi, Ioan
> >>>
> >>> When I run falkon-build-swift.sh, I got this problem. See the
> >>> attachment for log file.
> >>>
> >>> zhao
> >>>
> >>> Ioan Raicu wrote:
> >>>> cd .../falkon/
> >>>> source falkon.env
> >>>> falkon-checkout-cog.sh falkon-checkout-swift.sh
> >>>> falkon-checkout-provider.sh falkon-build-swift.s
> >>>> falkon-build-provider.sh
> >>>>
> >>>> viper:/home/iraicu/yongzh/sites-uc-64.xml
> >>>> viper:/home/iraicu/yongzh/tc-uc.data
> >>>>
> >>>>
> >>>>
> >>>>
> >
>
> plain text document attachment (err.txt)
> zzhang at login1.surveyor:~/falkon> falkon-build-swift.sh
> Buildfile: build.xml
> [taskdef] Could not load definitions from resource checkstyletask.properties. It could not be found.
> > clean: > > clean: > > clean: > [echo] [all]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [abstraction]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [abstraction-common]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [jglobus]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > > dep: > > dep.1: > > clean: > > clean: > [echo] [util]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > > delete.dependency.log.1: > > dep: > > dep.1: > > clean: > > clean: > [echo] [provider-gt2]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dep: > > dep.1: > > dep: > > dep.1: > > delete.dependency.log.1: > > dep: > > dep.1: > > BUILD FAILED > /gpfs/home/zzhang/cog/build.xml:92: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/all/build.xml:74: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:312: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:316: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/all/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > 
/gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/build.xml:77: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:312: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:316: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/dependencies.xml:10: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: Basedir /gpfs/home/zzhang/cog/modules/provider-gt2ft does not exist > > Total time: 2 seconds > Buildfile: build.xml > [taskdef] Could not load definitions from resource checkstyletask.properties. It could not be found. > > dist: > > dist: > > dist: > [mkdir] Created dir: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev > [mkdir] Created dir: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > [mkdir] Created dir: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [mkdir] Created dir: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/etc > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/etc > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > [java] Warning: source log (/gpfs/home/zzhang/cog/modules/all/CHANGES.txt) does not exist > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > 
dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > [java] Warning: source log (/gpfs/home/zzhang/cog/modules/jglobus/../../modules/jglobus/CHANGES.txt) does not exist > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > [echo] [jglobus]: DIST > [echo] [jglobus]: JARCOPY > [copy] Copying 11 files to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > [copy] Copying 4 files to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > > delete.jar: > [echo] [jglobus]: DELETE.JAR (cog-jglobus-dev-080204.jar) > [delete] Deleting: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-jglobus-dev-080204.jar > > compile: > [echo] [jglobus]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/jglobus/build > > copy.resources: > > jar: > [echo] [jglobus]: JAR (cog-jglobus-dev-080204.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/jglobus/build/etc.tmp > [jar] Warning: skipping jar archive /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-jglobus-dev-080204.jar because no files were included. 
> [jar] Building MANIFEST-only jar: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-jglobus-dev-080204.jar > > create: > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-init > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-destroy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-cert-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-info-search > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globusrun > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-url-copy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-gass-server > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER 
globus-gass-server-shutdown > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-personal-gatekeeper > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-change-pass-phrase > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER cog-myproxy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus2jks > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER cog-proxy-init > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > deploy.examples: > > do.deploy.examples: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > [java] Warning: source log (/gpfs/home/zzhang/cog/modules/util/../../modules/util/CHANGES.txt) does not exist > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > [echo] [util]: DIST > [echo] [util]: JARCOPY > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > > 
delete.jar: > [echo] [util]: DELETE.JAR (cog-util-0.92.jar) > > compile: > [echo] [util]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/util/build > [javac] Compiling 47 source files to /gpfs/home/zzhang/cog/modules/util/build > [javac] Note: * uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. > > copy.resources: > > jar: > [echo] [util]: JAR (cog-util-0.92.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/util/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-util-0.92.jar > > create: > > launcher: > > create.launcher: > [echo] [util]: LAUNCHER cog-register > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > deploy.examples: > > do.deploy.examples: > > delete.dependency.log.1: > [echo] [abstraction-common]: DIST > [echo] [abstraction-common]: JARCOPY > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib > > delete.jar: > [echo] [abstraction-common]: DELETE.JAR (cog-abstraction-common-2.2.jar) > > compile: > [echo] [abstraction-common]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/abstraction-common/build > [javac] Compiling 149 source files to /gpfs/home/zzhang/cog/modules/abstraction-common/build > [javac] Note: * uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. 
> > copy.resources: > > jar: > [echo] [abstraction-common]: JAR (cog-abstraction-common-2.2.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/abstraction-common/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-abstraction-common-2.2.jar > > create: > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cogrun > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-job-submit > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-file-operation > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-file-transfer > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/hierarchical-queue-handler > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/hierarchical-set-handler > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-task2xml > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 
file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/taskgraph-2-xml > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-checkpoint-submit > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-checkpoint-status > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/xml-2-taskgraph > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin/examples > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/bin > > deploy.examples: > > do.deploy.examples: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dep: > > dep.1: > > dep: > > dep.1: > > delete.dependency.log.1: > [echo] [provider-gt2]: DIST > [echo] [provider-gt2]: JARCOPY > > delete.jar: > [echo] [provider-gt2]: DELETE.JAR (cog-provider-gt2-2.3.jar) > > compile: > [echo] [provider-gt2]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/provider-gt2/build > [javac] Compiling 
21 source files to /gpfs/home/zzhang/cog/modules/provider-gt2/build > [javac] Note: /gpfs/home/zzhang/cog/modules/provider-gt2/src/org/globus/cog/abstraction/impl/file/ftp/CredentialsDialog.java uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. > > copy.resources: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/provider-gt2/build > > update.provider.props: > > jar: > [echo] [provider-gt2]: JAR (cog-provider-gt2-2.3.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/provider-gt2/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/dist/cog-4_1_6_dev/lib/cog-provider-gt2-2.3.jar > > create: > > deploy.examples: > > do.deploy.examples: > > dep: > > dep.1: > > BUILD FAILED > /gpfs/home/zzhang/cog/build.xml:81: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/all/build.xml:57: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:442: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/all/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/build.xml:58: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:442: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > 
/gpfs/home/zzhang/cog/modules/abstraction/dependencies.xml:10: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: Basedir /gpfs/home/zzhang/cog/modules/provider-gt2ft does not exist > > Total time: 31 seconds > Buildfile: build.xml > > cleanGenerated: > > distclean: > > clean: > [echo] [vdsk]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > [delete] Deleting: /gpfs/home/zzhang/cog/dependency.log.clean > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [karajan]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [abstraction]: CLEAN > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [abstraction-common]: CLEAN > [delete] Deleting directory /gpfs/home/zzhang/cog/modules/abstraction-common/build > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > clean: > > clean: > [echo] [jglobus]: CLEAN > [delete] Deleting directory /gpfs/home/zzhang/cog/modules/jglobus/build > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > > dep: > > dep.1: > > clean: > > clean: > [echo] [util]: CLEAN > [delete] Deleting directory /gpfs/home/zzhang/cog/modules/util/build > > clean.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > > delete.dependency.log.1: > > dep: > > dep.1: > > clean: > > clean: > [echo] [provider-gt2]: CLEAN > [delete] Deleting directory /gpfs/home/zzhang/cog/modules/provider-gt2/build > > clean.dependencies: 
> > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dep: > > dep.1: > > dep: > > dep.1: > > delete.dependency.log.1: > > dep: > > dep.1: > > BUILD FAILED > /gpfs/home/zzhang/cog/modules/vdsk/build.xml:189: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:312: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:316: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/vdsk/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/karajan/build.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:312: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:316: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/karajan/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/build.xml:77: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:312: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:316: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while 
executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/dependencies.xml:10: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: Basedir /gpfs/home/zzhang/cog/modules/provider-gt2ft does not exist > > Total time: 2 seconds > Buildfile: build.xml > > generateVersion: > > antlr: > [java] ANTLR Parser Generator Version 2.7.5 (20050128) 1989-2005 jGuru.com > [java] resources/swiftscript.g:936: warning:nondeterminism upon > [java] resources/swiftscript.g:936: k==1:LBRACK > [java] resources/swiftscript.g:936: k==2:ID,STRING_LITERAL,LBRACK,LPAREN,AT,PLUS,MINUS,STAR,NOT,INT_LITERAL,FLOAT_LITERAL,"true","false","null" > [java] resources/swiftscript.g:936: between alt 1 and exit branch of block > > compileSchema: > [java] Time to build schema type system: 1.676 seconds > [java] Time to generate code: 0.494 seconds > [java] Time to compile code: 5.0 seconds > [java] Compiled types to: /gpfs/home/zzhang/cog/modules/vdsk/../../modules/vdsk/lib/vdldefinitions.jar > > dist: > > dist: > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/etc > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/etc > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > [delete] Deleting: /gpfs/home/zzhang/cog/dependency.log.dist > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > 
log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > [java] Warning: source log (/gpfs/home/zzhang/cog/modules/jglobus/../../modules/jglobus/CHANGES.txt) does not exist > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > [echo] [jglobus]: DIST > [echo] [jglobus]: JARCOPY > [copy] Copying 11 files to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > [copy] Copying 4 files to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > > delete.jar: > [echo] [jglobus]: DELETE.JAR (cog-jglobus-dev-080204.jar) > [delete] Deleting: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-jglobus-dev-080204.jar > > compile: > [echo] [jglobus]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/jglobus/build > > copy.resources: > > jar: > [echo] [jglobus]: JAR (cog-jglobus-dev-080204.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/jglobus/build/etc.tmp > [jar] Warning: skipping jar archive /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-jglobus-dev-080204.jar because no files were included. 
> [jar] Building MANIFEST-only jar: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-jglobus-dev-080204.jar > > create: > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-init > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-proxy-destroy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-cert-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-info-search > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globusrun > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-url-copy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-gass-server > [copy] Copying 1 file to 
/gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-gass-server-shutdown > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus-personal-gatekeeper > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER grid-change-pass-phrase > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER cog-myproxy > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER globus2jks > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [jglobus]: LAUNCHER cog-proxy-init > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > deploy.examples: > > do.deploy.examples: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > [concat] No existing resources and no nested text, doing nothing > [java] 
Warning: source log (/gpfs/home/zzhang/cog/modules/util/../../modules/util/CHANGES.txt) does not exist > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > delete.dependency.log.1: > [echo] [util]: DIST > [echo] [util]: JARCOPY > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > > delete.jar: > [echo] [util]: DELETE.JAR (cog-util-0.92.jar) > > compile: > [echo] [util]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/util/build > [javac] Compiling 47 source files to /gpfs/home/zzhang/cog/modules/util/build > [javac] Note: * uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. > > copy.resources: > > jar: > [echo] [util]: JAR (cog-util-0.92.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/util/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-util-0.92.jar > > create: > > launcher: > > create.launcher: > [echo] [util]: LAUNCHER cog-register > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > deploy.examples: > > do.deploy.examples: > > delete.dependency.log.1: > [echo] [abstraction-common]: DIST > [echo] [abstraction-common]: JARCOPY > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib > > delete.jar: > [echo] [abstraction-common]: DELETE.JAR (cog-abstraction-common-2.2.jar) > > compile: > [echo] [abstraction-common]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/abstraction-common/build > [javac] Compiling 149 source files to /gpfs/home/zzhang/cog/modules/abstraction-common/build > [javac] Note: * uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. 
> > copy.resources: > > jar: > [echo] [abstraction-common]: JAR (cog-abstraction-common-2.2.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/abstraction-common/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-abstraction-common-2.2.jar > > create: > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cogrun > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-job-submit > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-file-operation > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-file-transfer > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/hierarchical-queue-handler > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/hierarchical-set-handler > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > > launcher: > > 
create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-task2xml > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/taskgraph-2-xml > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-checkpoint-submit > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-checkpoint-status > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > example.launcher: > > create.example.launcher: > [echo] [abstraction-common]: EXAMPLE LAUNCHER examples/xml-2-taskgraph > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin/examples > > launcher: > > create.launcher: > [echo] [abstraction-common]: LAUNCHER cog-info > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/bin > > deploy.examples: > > do.deploy.examples: > > dep: > > dep.1: > > dist: > > dist: > > log4j.properties: > > log4j.check.module: > > log4j.properties.init: > > log4j.properties.update: > > build.dependencies: > > dependencies: > > create.dependency.log: > > delete.dependency.log.1: > > deps: > > dep: > > dep.1: > > dep: > > dep.1: > > dep: > > dep.1: > > 
delete.dependency.log.1: > [echo] [provider-gt2]: DIST > [echo] [provider-gt2]: JARCOPY > > delete.jar: > [echo] [provider-gt2]: DELETE.JAR (cog-provider-gt2-2.3.jar) > > compile: > [echo] [provider-gt2]: COMPILE > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/provider-gt2/build > [javac] Compiling 21 source files to /gpfs/home/zzhang/cog/modules/provider-gt2/build > [javac] Note: /gpfs/home/zzhang/cog/modules/provider-gt2/src/org/globus/cog/abstraction/impl/file/ftp/CredentialsDialog.java uses or overrides a deprecated API. > [javac] Note: Recompile with -Xlint:deprecation for details. > > copy.resources: > [copy] Copying 1 file to /gpfs/home/zzhang/cog/modules/provider-gt2/build > > update.provider.props: > > jar: > [echo] [provider-gt2]: JAR (cog-provider-gt2-2.3.jar) > [mkdir] Created dir: /gpfs/home/zzhang/cog/modules/provider-gt2/build/etc.tmp > [jar] Building jar: /gpfs/home/zzhang/cog/modules/vdsk/dist/vdsk-0.3-dev/lib/cog-provider-gt2-2.3.jar > > create: > > deploy.examples: > > do.deploy.examples: > > dep: > > dep.1: > > BUILD FAILED > /gpfs/home/zzhang/cog/modules/vdsk/build.xml:73: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:442: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/vdsk/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/karajan/build.xml:59: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:442: The following error occurred while executing this line: > 
/gpfs/home/zzhang/cog/mbuild.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/karajan/dependencies.xml:4: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/build.xml:58: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:442: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:78: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:51: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/modules/abstraction/dependencies.xml:10: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:162: The following error occurred while executing this line: > /gpfs/home/zzhang/cog/mbuild.xml:167: Basedir /gpfs/home/zzhang/cog/modules/provider-gt2ft does not exist > > Total time: 34 seconds > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From wilde at mcs.anl.gov Wed Feb 13 14:02:27 2008 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 13 Feb 2008 14:02:27 -0600 Subject: [Swift-user] Missing User Guide PDF on Swift web site Message-ID: <47B34CD3.1050109@mcs.anl.gov> Robert Tomek, from Ian's class, reports that the PDF Users Guide is missing (broken link). He wants to use Swift in a class project. 
From hategan at mcs.anl.gov Wed Feb 13 14:30:51 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 13 Feb 2008 14:30:51 -0600 Subject: [Swift-user] Missing User Guide PDF on Swift web site In-Reply-To: <47B34CD3.1050109@mcs.anl.gov> References: <47B34CD3.1050109@mcs.anl.gov> Message-ID: <1202934651.21546.1.camel@blabla.mcs.anl.gov> Hmm. It looks like compilation to pdf fails on the user guide. I'm not quite sure why. On Wed, 2008-02-13 at 14:02 -0600, Michael Wilde wrote: > Robert Tomek, from Ian's class, reports that the PDF Users Guide is > missing (broken link). > > He wants to use Swift in a class project. > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > From hategan at mcs.anl.gov Wed Feb 13 14:55:28 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 13 Feb 2008 14:55:28 -0600 Subject: [Swift-user] Missing User Guide PDF on Swift web site In-Reply-To: <1202934651.21546.1.camel@blabla.mcs.anl.gov> References: <47B34CD3.1050109@mcs.anl.gov> <1202934651.21546.1.camel@blabla.mcs.anl.gov> Message-ID: <1202936128.21546.5.camel@blabla.mcs.anl.gov> On Wed, 2008-02-13 at 14:30 -0600, Mihael Hategan wrote: > Hmm. It looks like compilation to pdf fails on the user guide. I'm not > quite sure why. Now I do. And I fixed it. > > On Wed, 2008-02-13 at 14:02 -0600, Michael Wilde wrote: > > Robert Tomek, from Ian's class, reports that the PDF Users Guide is > > missing (broken link). > > > > He wants to use Swift in a class project. 
> > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > From lixi at uchicago.edu Tue Feb 19 15:32:34 2008 From: lixi at uchicago.edu (lixi at uchicago.edu) Date: Tue, 19 Feb 2008 15:32:34 -0600 (CST) Subject: [Swift-user] Swift running errors Message-ID: <20080219153234.AWQ26226@m4500-03.uchicago.edu> Hi, I have two problems. 1. Today, when I try to run a Swift workflow on multiple OSG sites, I always encounter the following errors, which cause the run to fail: [lixi at login remote]$ swift -tc.file /home/lixi/swift/test/tc.data -sites.file /home/lixi/swift/test/OSGEDU_Sites.xml workflowtest.swift Swift v0.3-dev r1674 (modified locally) RunID: 20080219-1447-1hztqje9 node started Failed to transfer kickstart records from workflowtest-20080219-1447-1hztqje9/kickstart/8/CIT_CMS_T2Exception in getFile task:transfer @ vdl-int.k, line: 322 sys:try @ vdl-int.k, line: 322 vdl:transferkickstartrec @ vdl-int.k, line: 409 sys:set @ vdl-int.k, line: 409 sys:sequential @ vdl-int.k, line: 409 sys:try @ vdl-int.k, line: 408 sys:else @ vdl-int.k, line: 407 sys:if @ vdl-int.k, line: 405 sys:set @ vdl-int.k, line: 404 sys:catch @ vdl-int.k, line: 396 sys:try @ vdl-int.k, line: 354 task:allocatehost @ vdl-int.k, line: 334 vdl:execute2 @ execute-default.k, line: 23 sys:restartonerror @ execute-default.k, line: 21 sys:sequential @ execute-default.k, line: 19 sys:try @ execute-default.k, line: 18 sys:if @ execute-default.k, line: 17 sys:then @ execute-default.k, line: 16 sys:if @ execute-default.k, line: 15 vdl:execute @ workflowtest.kml, line: 31 worknode @ workflowtest.kml, line: 79 sys:sequential @ workflowtest.kml, line: 78 sys:parallel @ workflowtest.kml, line: 77 vdl:mainp @ 
workflowtest.kml, line: 76 mainp @ vdl.k, line: 150 vdl:mains @ workflowtest.kml, line: 75 vdl:mains @ workflowtest.kml, line: 75 rlog:restartlog @ workflowtest.kml, line: 74 kernel:project @ workflowtest.kml, line: 2 workflowtest-20080219-1447-1hztqje9 Caused by: org.globus.cog.abstraction.impl.file.FileResourceException: Exception in getFile Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: (error code 1) [Nested exception message: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190: 500-globus_l_gfs_file_open failed. 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694: 500-globus_xio_register_open failed. 500-globus_xio_file_driver.c:globus_l_xio_file_open:438: 500-Unable to open file /raid2/osg-data/lixi/workflowtest-20080219-1447-1hztqje9/kickstart/8/node-8kgjdnoi-kickstart.xml 500-globus_xio_file_driver.c:globus_l_xio_file_open:381: 500-System error in open: No such file or directory 500-globus_xio: A system call failed: No such file or directory 500 End.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190: 500-globus_l_gfs_file_open failed. 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694: 500-globus_xio_register_open failed. 500-globus_xio_file_driver.c:globus_l_xio_file_open:438: 500-Unable to open file /raid2/osg-data/lixi/workflowtest-20080219-1447-1hztqje9/kickstart/8/node-8kgjdnoi-kickstart.xml 500-globus_xio_file_driver.c:globus_l_xio_file_open:381: 500-System error in open: No such file or directory 500-globus_xio: A system call failed: No such file or directory 500 End.] 2. When running a workflow which involves 1000 nodes, I encounter the following errors very frequently, but not all the time: ... 
node completed node completed node completed node completed node completed node failed Execution failed: Exception in node: Arguments: [_concurrent/intermediatefile-b5b5dc39-df70-4137-8149-c20f5d1af839-, out.0132.txt] Host: localhost Directory: workflowtest-20080219-1443-2qx4ctkc/jobs/6/node-64kddnoi stderr.txt: stdout.txt: ---- Caused by: java.io.IOException: Too many open files Could you tell me why and teach me how to resolve such problems? Thanks, Xi From benc at hawaga.org.uk Wed Feb 20 05:41:16 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 20 Feb 2008 11:41:16 +0000 (GMT) Subject: [Swift-user] Swift running errors In-Reply-To: <20080219153234.AWQ26226@m4500-03.uchicago.edu> References: <20080219153234.AWQ26226@m4500-03.uchicago.edu> Message-ID: On Tue, 19 Feb 2008, lixi at uchicago.edu wrote: > Failed to transfer kickstart records from workflowtest- > 20080219-1447-1hztqje9/kickstart/8/CIT_CMS_T2Exception in > getFile Sometimes this happens because there was an error running your job for some other reason (so the job didn't run, a kickstart record wasn't generated and it couldn't be transferred). Have a look in the log file for an error earlier than that (or put the log files online somewhere so I can look). > 2. When running a workflow which involves 1000 nodes, I > encounter the following errors very frequently, but not all > the time: [..] > java.io.IOException: Too many open files What machine are you running on? Have you changed any configuration parameters? A basic 1000 job workflow should run fine with the default settings. On the machine you are running on, type ulimit -a and paste that here. -- From hategan at mcs.anl.gov Wed Feb 20 09:42:12 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 20 Feb 2008 09:42:12 -0600 Subject: [Swift-user] Swift running errors In-Reply-To: References: <20080219153234.AWQ26226@m4500-03.uchicago.edu> Message-ID: <1203522133.2859.2.camel@blabla.mcs.anl.gov> > > 2. 
When runing a workflow which involves 1000nodes, I > > encounter the following errors very frequently, but not all > > the time: > [..] > > java.io.IOException: Too many open files > > What machine are you running on? Have you changed any configuration > parameters? A basic 1000 job workflow should run fine with the default > settings. > > On the machine you are running on, type ulimit -a and paste that here. > There is some odd bug in the JVM. Some, and I don't remember exactly which, classes related to time/date open time zone resource files repeatedly and they forget to close them. If this is done fast enough (faster than the garbage collector reaches those files which get closed when finalized), you get this error. It might be something else though. From lixi at uchicago.edu Wed Feb 20 09:58:56 2008 From: lixi at uchicago.edu (lixi at uchicago.edu) Date: Wed, 20 Feb 2008 09:58:56 -0600 (CST) Subject: [Swift-user] Swift running errors Message-ID: <20080220095856.AWR04393@m4500-03.uchicago.edu> >> Failed to transfer kickstart records from workflowtest- >> 20080219-1447-1hztqje9/kickstart/8/CIT_CMS_T2Exception in >> getFile >Sometimes this happens because there was an error running you job for some >other reason (so the job didn't run, a kickstart record wasn't generated >and it couldn't be transfered). Have a look in the log file for an error >earlier than that (or put the log files online some so I can look). This log file is in terminable.ci.uchicago.edu: /home/lixi/swift/test/newtest/1000nodes/remote/workflowtest- 20080219-1447-1hztqje9.log >> 2. When runing a workflow which involves 1000nodes, I >> encounter the following errors very frequently, but not all >> the time: [..] >> java.io.IOException: Too many open files >What machine are you running on? Have you changed any configuration >parameters? A basic 1000 job workflow should run fine with the default >settings. >On the machine you are running on, type ulimit -a and paste that here. 
Yesterday I was running that on login.ci.uchicago.edu, but it crashed last night, so I tried ulimit -a on terminable.ci.uchicago.edu and got:

[lixi at terminable ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
pending signals                 (-i) 1024
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15871
virtual memory          (kbytes, -v) unlimited

Does it mean that I can open at most 1024 files at one time?

Thanks,
Xi

From benc at hawaga.org.uk Wed Feb 20 10:02:03 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 20 Feb 2008 16:02:03 +0000 (GMT)
Subject: [Swift-user] Swift running errors
In-Reply-To: <20080220095856.AWR04393@m4500-03.uchicago.edu>
References: <20080220095856.AWR04393@m4500-03.uchicago.edu>
Message-ID:

On Wed, 20 Feb 2008, lixi at uchicago.edu wrote:

> >On the machine you are running on, type ulimit -a and
> paste that here.
>
> Yesterday I was running that on login.ci.uchicago.edu, but
> it crashed last night, so I tried ulimit -a on
> terminable.ci.uchicago.edu and got:

Does this problem happen for you on terminable?

Also, as login.ci was being unreliable, I'd try again to see if you still have the problem.
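On the "open files (-n) 1024" line in the ulimit output: that value is the per-process cap on open file descriptors, and sockets and pipes count toward it as well as regular files. A minimal sketch, not taken from Swift, of how a Java program could watch its own descriptor usage against that cap. It assumes a Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean; the class name FdUsage is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

final class FdUsage {
    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the platform bean is not the Unix variant.
     */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // assumption: non-Unix JVMs simply report nothing here
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open descriptors: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

Logging these two numbers periodically during a large run would show whether the open count climbs steadily (a leak) or only spikes under load.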
-- From hategan at mcs.anl.gov Wed Feb 20 10:04:30 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 20 Feb 2008 10:04:30 -0600 Subject: [Swift-user] Swift running errors In-Reply-To: <20080220095856.AWR04393@m4500-03.uchicago.edu> References: <20080220095856.AWR04393@m4500-03.uchicago.edu> Message-ID: <1203523470.3454.0.camel@blabla.mcs.anl.gov> On Wed, 2008-02-20 at 09:58 -0600, lixi at uchicago.edu wrote: > This log file is in terminable.ci.uchicago.edu: > /home/lixi/swift/test/newtest/1000nodes/remote/workflowtest- > 20080219-1447-1hztqje9.log > I don't see any IOException in that log file. From lixi at uchicago.edu Wed Feb 20 10:08:31 2008 From: lixi at uchicago.edu (lixi at uchicago.edu) Date: Wed, 20 Feb 2008 10:08:31 -0600 (CST) Subject: [Swift-user] Swift running errors Message-ID: <20080220100831.AWR06340@m4500-03.uchicago.edu> This log file corresponds only to the first problem, the node execution error. From benc at hawaga.org.uk Wed Feb 20 10:15:07 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 20 Feb 2008 16:15:07 +0000 (GMT) Subject: [Swift-user] Swift running errors In-Reply-To: <20080220095856.AWR04393@m4500-03.uchicago.edu> References: <20080220095856.AWR04393@m4500-03.uchicago.edu> Message-ID: On Wed, 20 Feb 2008, lixi at uchicago.edu wrote: > >Sometimes this happens because there was an error running > you job for some > >other reason (so the job didn't run, a kickstart record > wasn't generated > >and it couldn't be transfered). Have a look in the log file The log file shows some other problems with running that workflow. It looks like no jobs run properly. Can you run the example hello world workflow against that site? (in examples/vdsk/first.swift) Also, please send your sites.xml and tc.data settings for that site. 
-- From benc at hawaga.org.uk Wed Feb 20 11:27:45 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 20 Feb 2008 17:27:45 +0000 (GMT) Subject: [Swift-user] Swift running errors In-Reply-To: <20080220102647.AWR09062@m4500-03.uchicago.edu> References: <20080220102647.AWR09062@m4500-03.uchicago.edu> Message-ID: On Wed, 20 Feb 2008, lixi at uchicago.edu wrote: > Execution failed: > 'vdl:pre' is not defined. This happens because you have built a newer version of swift, and one of the intermediate files (the .kml file) needs to be rebuilt. Touch the .swift file with: touch first.swift and then run swift to make this go away. -- From iraicu at cs.uchicago.edu Wed Feb 20 12:32:32 2008 From: iraicu at cs.uchicago.edu (Ioan Raicu) Date: Wed, 20 Feb 2008 12:32:32 -0600 Subject: [Swift-user] Swift running errors In-Reply-To: <1203522133.2859.2.camel@blabla.mcs.anl.gov> References: <20080219153234.AWQ26226@m4500-03.uchicago.edu> <1203522133.2859.2.camel@blabla.mcs.anl.gov> Message-ID: <47BC7240.9080502@cs.uchicago.edu> I doubt it's a bug in the JVM; it's probably the application not closing all the file/stream handles. For example, I ran into this problem when I was using: > Process child = Runtime.getRuntime().exec(command); but not consuming and closing the stdout and stderr streams which get created automatically when the child process gets created. Simply setting child to null after the process exited did not close the streams. This quickly led to too many open file handles after a few hundred exec() calls. I had to make a wrapper class around the Process class that took care of consuming and closing the output streams by default, and my problems went away. Just another idea of where to look for potential problems in the code. Ioan Mihael Hategan wrote: >>> 2. When runing a workflow which involves 1000nodes, I >>> encounter the following errors very frequently, but not all >>> the time: >>> >> [..] 
>> >>> java.io.IOException: Too many open files >>> >> What machine are you running on? Have you changed any configuration >> parameters? A basic 1000 job workflow should run fine with the default >> settings. >> >> On the machine you are running on, type ulimit -a and paste that here. >> >> > > There is some odd bug in the JVM. Some, and I don't remember exactly > which, classes related to time/date open time zone resource files > repeatedly and they forget to close them. If this is done fast enough > (faster than the garbage collector reaches those files which get closed > when finalized), you get this error. > > It might be something else though. > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > -- ================================================== Ioan Raicu Ph.D. Candidate ================================================== Distributed Systems Laboratory Computer Science Department University of Chicago 1100 E. 58th Street, Ryerson Hall Chicago, IL 60637 ================================================== Email: iraicu at cs.uchicago.edu Web: http://www.cs.uchicago.edu/~iraicu http://dev.globus.org/wiki/Incubator/Falkon http://www.ci.uchicago.edu/wiki/bin/view/VDS/DslCS ================================================== ================================================== -------------- next part -------------- An HTML attachment was scrubbed... 
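The wrapper Ioan describes could look roughly like the sketch below. It is an illustration only, not his code and not what Swift does: the class and method names are invented, the demo in main assumes a Unix-like sh, and it uses ProcessBuilder with stderr merged into stdout (instead of the Runtime.exec call he quotes) so that a single thread can drain everything without blocking.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class DrainedExec {
    /** Runs a command, discards its output, closes all three pipes, returns the exit code. */
    static int run(String... command) {
        Process child = null;
        try {
            child = new ProcessBuilder(command).redirectErrorStream(true).start();
            child.getOutputStream().close();           // nothing to send on the child's stdin
            InputStream out = child.getInputStream();  // stdout and stderr, merged above
            byte[] buf = new byte[8192];
            while (out.read(buf) != -1) {
                // discard the output; a real wrapper might log or collect it
            }
            return child.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        } finally {
            if (child != null) {
                // Close explicitly rather than waiting for the garbage collector.
                closeQuietly(child.getInputStream());
                closeQuietly(child.getErrorStream());
                closeQuietly(child.getOutputStream());
                child.destroy(); // no effect if the child has already exited
            }
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // best effort: the descriptor is released either way
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 500; i++) {
            run("sh", "-c", "echo out; echo err 1>&2");
        }
        System.out.println("500 children run");
    }
}
```

Each exec creates three pipes in the parent, so a few hundred undrained children are enough to reach a 1024-descriptor limit; closing them in a finally block removes the dependence on finalization timing.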
URL: From hategan at mcs.anl.gov Wed Feb 20 13:21:41 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 20 Feb 2008 13:21:41 -0600 Subject: [Swift-user] Swift running errors In-Reply-To: <47BC7240.9080502@cs.uchicago.edu> References: <20080219153234.AWQ26226@m4500-03.uchicago.edu> <1203522133.2859.2.camel@blabla.mcs.anl.gov> <47BC7240.9080502@cs.uchicago.edu> Message-ID: <1203535301.7436.8.camel@blabla.mcs.anl.gov> On Wed, 2008-02-20 at 12:32 -0600, Ioan Raicu wrote: > I doubt its a bug in the JVM, its probably the application not closing > all the file/stream handles. Hard to miss an opportunity to argue, isn't it? Well, my debugger disagrees with you. And I'll trust the debugger on this one. > For example, I ran into this problem when I was using: > > Process child = Runtime.getRuntime().exec(command); > but not consuming and closing the stdout and stderr streams which get > created automatically when the child process gets created. Simply > setting child to null after the process exited did not close the > streams. Excellent. You've identified a similar problem. Eventually files get closed when you set things to null, but only when the garbage collector gets to those objects and finalizes them. > This quickly lead to too many open file handles after a few hundred > exec() calls. I had to make a wrapper class of the Process class that > took care of consuming and closing the output streams by default, and > my problems went away. > > Just another idea of where to look for potential problems in the code. > > Ioan > > Mihael Hategan wrote: > > > > 2. When runing a workflow which involves 1000nodes, I > > > > encounter the following errors very frequently, but not all > > > > the time: > > > > > > > [..] > > > > > > > java.io.IOException: Too many open files > > > > > > > What machine are you running on? Have you changed any configuration > > > parameters? A basic 1000 job workflow should run fine with the default > > > settings. 
> > > > > > On the machine you are running on, type ulimit -a and paste that here. > > > > > > > > > > There is some odd bug in the JVM. Some, and I don't remember exactly > > which, classes related to time/date open time zone resource files > > repeatedly and they forget to close them. If this is done fast enough > > (faster than the garbage collector reaches those files which get closed > > when finalized), you get this error. > > > > It might be something else though. > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > -- > ================================================== > Ioan Raicu > Ph.D. Candidate > ================================================== > Distributed Systems Laboratory > Computer Science Department > University of Chicago > 1100 E. 58th Street, Ryerson Hall > Chicago, IL 60637 > ================================================== > Email: iraicu at cs.uchicago.edu > Web: http://www.cs.uchicago.edu/~iraicu > http://dev.globus.org/wiki/Incubator/Falkon > http://www.ci.uchicago.edu/wiki/bin/view/VDS/DslCS > ================================================== > ================================================== > From quanpt at cs.uchicago.edu Thu Feb 21 23:42:06 2008 From: quanpt at cs.uchicago.edu (Quan Tran Pham) Date: Thu, 21 Feb 2008 23:42:06 -0600 Subject: [Swift-user] Error: "Missing -if argument" Message-ID: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> Dear all, My simple wc swift script is below, and I keep getting the error: "Missing -if argument", can anyone help me with that? 
type inputfile {};
type outputfile {};

(outputfile o) fileWc(inputfile file) {
  app {
    wc "-w" @filename(file) stdout=@filename(o);
  }
}

(outputfile ofiles[]) mainLoop(inputfile files[]) {
  foreach file,i in files {
    ofiles[i]=fileWc(file);
  }
}

inputfile files[];
outputfile ofiles[];
ofiles=mainLoop(files);

======================
Also, I have a question: if I want to use csplit to split one big file into a number of small files:

/usr/bin/csplit -f myprefix mybigfile 1000 {*}

which basically splits mybigfile into many small files myprefix01, myprefix02, ..., myprefix99.
How can I do that? I tried

(outfile outf[]) split(infile f) {
  app {....}
}
infile f<"foo.txt">;
outfile outfile[] ;
outfile=split(f);

but it does not work. I figure the split part can be done at one node, hence can be done outside swift, but there should be some other ways?

Thank you very much
Quan Pham
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From benc at hawaga.org.uk Fri Feb 22 06:14:18 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 22 Feb 2008 12:14:18 +0000 (GMT)
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com>
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com>
Message-ID:

On Thu, 21 Feb 2008, Quan Tran Pham wrote:

> My simple wc swift script is below, and I keep getting the error: "Missing
> -if argument", can anyone help me with that?

Have you successfully run any other SwiftScripts with the same install/version and on the same site?

What version are you using?

Are you running locally or sending to a remote site?

What do you see from ls -l ./data/

Please send a log file from such a run.
-- From benc at hawaga.org.uk Fri Feb 22 06:16:28 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Fri, 22 Feb 2008 12:16:28 +0000 (GMT) Subject: [Swift-user] Error: "Missing -if argument" In-Reply-To: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> Message-ID: On Thu, 21 Feb 2008, Quan Tran Pham wrote: > Also, I have a question: if I want to use csplit to split one big file to a > number of small file: > /usr/bin/csplit -f myprefix mybigfile 1000 {*} > which basically splits mybigfile into many small file myprefix01, > myprefix02, ..., myprefix99. > How can I do that? I try At the moment, Swift can't deal with that because you don't know the exact names of the output files before execution. This is something that has been asked for before but it has not been implemented. -- From benc at hawaga.org.uk Fri Feb 22 06:35:44 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Fri, 22 Feb 2008 12:35:44 +0000 (GMT) Subject: [Swift-user] Error: "Missing -if argument" In-Reply-To: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> Message-ID: On Thu, 21 Feb 2008, Quan Tran Pham wrote: > which basically splits mybigfile into many small file myprefix01, > myprefix02, ..., myprefix99. are you expecting exactly 99 files back, or variable numbers? (I think variable numbers, given how csplit works) but I would like to check. 
--

From wilde at mcs.anl.gov Fri Feb 22 07:05:44 2008
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 22 Feb 2008 07:05:44 -0600
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To:
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com>
Message-ID: <47BEC8A8.3080808@mcs.anl.gov>

Quan, I think for now, the best approach is to wrap any tool or script that
takes in or puts out files of unknown names or quantities with a script
that puts out files of names that can be known before execution.

I'll be at the CI after 10AM and we can meet to work on the Swift code if
that would be helpful.

- Mike

On 2/22/08 6:35 AM, Ben Clifford wrote:
> On Thu, 21 Feb 2008, Quan Tran Pham wrote:
>
>> which basically splits mybigfile into many small file myprefix01,
>> myprefix02, ..., myprefix99.
>
> are you expecting exactly 99 files back, or variable numbers? (I think
> variable numbers, given how csplit works) but I would like to check.
>

From quanpt at cs.uchicago.edu Fri Feb 22 09:07:28 2008
From: quanpt at cs.uchicago.edu (Quan Tran Pham)
Date: Fri, 22 Feb 2008 09:07:28 -0600
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To:
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com>
Message-ID: <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com>

Hi Ben,

Instead of this

outputfile ofiles[];
foreach file,i in files {
  ofiles[i]=fileWc(file);
}

I use this foreach loop which works correctly for me (but I don't get the
output array, only one at a time, still ok anyways)

foreach f in files {
  countfile c;
  c = fileWc(f);
}

I am using the 2nd loop, hence I don't have the log for previous one. I can
recreate if you need?

About csplit, yes, variable number of files, not only 99 as in my example.
I think I can use Mike's suggestion: wrap csplit with an ls, get the output
and load into another mapper. That should work.

Thank you very much
Regards
Quan Pham

On Fri, Feb 22, 2008 at 6:14 AM, Ben Clifford wrote:

>
> On Thu, 21 Feb 2008, Quan Tran Pham wrote:
>
> > My simple wc swift script is below, and I keep getting the error: "Missing
> > -if argument", can anyone help me with that?
>
> Have you successfully run any other SwiftScripts with the same
> install/version and on the same site?
>
> What version are you using?
>
> Are you running locally or sending to a remote site?
>
> What do you see from ls -l ./data/
>
> Please send a log file from such a run.
>
> --

From benc at hawaga.org.uk Fri Feb 22 10:06:23 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 22 Feb 2008 16:06:23 +0000 (GMT)
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To: <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com>
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com>
Message-ID:

On Fri, 22 Feb 2008, Quan Tran Pham wrote:

> I am using the 2nd loop, hence I don't have the log for previous one. I can
> recreate if you need?

yes, I would be very interested in you being able to recreate the problem
(and then give the information about the run that I asked for in the
previous email) as it is an error I have not seen much / at all before.

--

From hategan at mcs.anl.gov Fri Feb 22 10:11:21 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 22 Feb 2008 10:11:21 -0600
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To:
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com>
Message-ID: <1203696682.893.1.camel@blabla.mcs.anl.gov>

On Fri, 2008-02-22 at 16:06 +0000, Ben Clifford wrote:
>
> On Fri, 22 Feb 2008, Quan Tran Pham wrote:
>
> > I am using the 2nd loop, hence I don't have the log for previous one.
> > I can recreate if you need?
>
> yes, I would be very interested in you being able to recreate the problem
> (and then give the information about the run that I asked for in the
> previous email) as it is an error I have not seen much / at all before.

The wrapper seems to be the one reporting it. -if is the list of input
files.

>

From benc at hawaga.org.uk Fri Feb 22 10:18:41 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 22 Feb 2008 16:18:41 +0000 (GMT)
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To: <1203696682.893.1.camel@blabla.mcs.anl.gov>
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com> <1203696682.893.1.camel@blabla.mcs.anl.gov>
Message-ID:

On Fri, 22 Feb 2008, Mihael Hategan wrote:

> The wrapper seems to be the one reporting it. -if is the list of input
> files.

Right, I know that much...

--

From quanpt at cs.uchicago.edu Fri Feb 22 11:45:02 2008
From: quanpt at cs.uchicago.edu (Quan Tran Pham)
Date: Fri, 22 Feb 2008 11:45:02 -0600
Subject: [Swift-user] Error: "Missing -if argument"
In-Reply-To:
References: <4290b6c60802212142j11486a6etcf2de22bc3e48b4b@mail.gmail.com> <4290b6c60802220707g7bdae794w2c13033303bf443a@mail.gmail.com> <1203696682.893.1.camel@blabla.mcs.anl.gov>
Message-ID: <4290b6c60802220945n5f356c72j6de6240f54656eab@mail.gmail.com>

This time, the error from swift command is

"Caused by: No status file was found. Check the shared filesystem on localhost"

But you can still find the "Missing -if argument" in the log file.

Thanks
Quan Pham

On Fri, Feb 22, 2008 at 10:18 AM, Ben Clifford wrote:

>
> On Fri, 22 Feb 2008, Mihael Hategan wrote:
>
> > The wrapper seems to be the one reporting it. -if is the list of input
> > files.
>
> Right, I know that much...
>
> --
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 02.sort-20080222-1135-24yc9v3e.log
Type: application/octet-stream
Size: 45242 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 02.sort.swift
Type: application/octet-stream
Size: 433 bytes
Desc: not available
URL:
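
For the csplit question earlier in this thread, the workaround Mike and Quan
describe (wrap the splitting tool so that it also produces a file whose name
is known before execution) might look roughly like the sketch below. It is
not code from the thread. The function name split_with_manifest and the
manifest file name pieces.txt are hypothetical, and the splitting is done
directly in Python as a stand-in for csplit, so piece boundaries and
numbering are assumptions rather than a reproduction of csplit's behaviour.
How the manifest would then be fed to a Swift mapper is not shown.

```python
import os
import tempfile


def split_with_manifest(big_file, prefix, lines_per_piece, manifest):
    """Split big_file into pieces of at most lines_per_piece lines.

    Pieces are named prefix00, prefix01, ... (hypothetical naming scheme).
    The piece names are also written, one per line, to `manifest`, whose
    name is fixed in advance, so a caller can rely on that one file even
    though the number of pieces is not known before the run.
    Returns the list of piece names.
    """
    pieces = []
    out = None
    with open(big_file) as src:
        for lineno, line in enumerate(src):
            # Start a new piece every lines_per_piece lines.
            if lineno % lines_per_piece == 0:
                if out is not None:
                    out.close()
                name = "%s%02d" % (prefix, len(pieces))
                pieces.append(name)
                out = open(name, "w")
            out.write(line)
    if out is not None:
        out.close()
    # The manifest plays the role of the "ls" output in Quan's description.
    with open(manifest, "w") as listing:
        for name in pieces:
            listing.write(name + "\n")
    return pieces


if __name__ == "__main__":
    # Small demonstration in a scratch directory.
    workdir = tempfile.mkdtemp()
    big = os.path.join(workdir, "mybigfile")
    with open(big, "w") as f:
        for i in range(2500):
            f.write("line %d\n" % i)
    names = split_with_manifest(big, os.path.join(workdir, "myprefix"),
                                1000, os.path.join(workdir, "pieces.txt"))
    print(len(names), "pieces listed in pieces.txt")
```

The point of the design is that only the manifest has to be declared as an
output up front; the variable set of piece files is discovered by reading it
afterwards.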