<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Great!<br>
<br>
Jing Tie wrote:
<blockquote
cite="mid:ce8053b90709100926lba7402ey38d5339875b5aad1@mail.gmail.com"
type="cite">Hi,<br>
<br>
It works fine now (log is attached). I'll try the sid program next.<br>
<br>
Many thanks,<br>
Jing<br>
<br>
On 9/10/07, Mihael Hategan <<a moz-do-not-send="true"
href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
> You need to do an SVN update for both CoG and Swift.
<br>
> <br>
> On Mon, 2007-09-10 at 11:03 -0500, Jing Tie wrote:<br>
> > Hi,<br>
> ><br>
> > It has the same exception. Log is attached.<br>
> ><br>
> > Thanks,<br>
> > Jing<br>
> ><br>
> > On 9/10/07, Mihael Hategan <<a moz-do-not-send="true"
href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
> > > On Sun, 2007-09-09 at 12:29 -0500, Mihael Hategan wrote:<br>
> > > > It's definitely a bug in the code I put in a few
days ago. But I don't
<br>
> > > > quite see how it happens. Such simple code. Yet how
complex. I'll have<br>
> > > > to get back to you on it.<br>
> > ><br>
> > > Fixed, I think. Can you try again?<br>
> > ><br>
> > > Mihael<br>
> > ><br>
> > > ><br>
> > > > On Sun, 2007-09-09 at 12:09 -0500, Jing Tie wrote:<br>
> > > > > Sure.<br>
> > > > ><br>
> > > > > On 9/9/07, Mihael Hategan <
<a moz-do-not-send="true" href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>>
wrote:<br>
> > > > > > Please post the whole workflow and the
whole log.<br>
> > > > > ><br>
> > > > > > On Sun, 2007-09-09 at 11:43 -0500, Jing
Tie wrote:
<br>
> > > > > > > Hi,<br>
> > > > > > ><br>
> > > > > > > I tried the duplicate job again using the latest Swift with<br>
> > > > > > > "log4j.logger.org.globus.cog.abstraction=DEBUG".<br>
> > > > > > ><br>
> > > > > > > It didn't generate a duplicate-*** directory under the simple-wf-***<br>
> > > > > > > directory. Details:<br>
> > > > > > > Resource (org.globus.cog.abstraction.impl.file.gridftp.FileResourceImpl@cbc2d3)<br>
> > > > > > > successfully released<br>
> > > > > > > Task(type=4, identity=urn:0-0-1189348296362) setting status to Failed<br>
> > > > > > > Could not set current directory to "null"<br>
> > > > > > > duplicate failed<br>
> > > > > > > The following errors have occurred:<br>
> > > > > > > 1. Could not initialize shared
directory on NWICG_NotreDame<br>
> > > > > > > Caused by:<br>
> > > > > > > Could not set current
directory to "null"
<br>
> > > > > > > Caused by:<br>
> > > > > > > Required argument missing<br>
> > > > > > ><br>
> > > > > > > in simple-wf-fc06kzz28d880.log:<br>
> > > > > > > 2007-09-09 09:31:39,911 DEBUG FileResourceCache Resource (org.globus.cog.abstraction.impl.file.gridftp.FileResourceImpl@cbc2d3) successfully released<br>
> > > > > > > 2007-09-09 09:31:39,911 DEBUG TaskImpl Task(type=4, identity=urn:0-0-1189348296362) setting status to Failed Could not set current directory to "null"<br>
> > > > > > > 2007-09-09 09:31:39,938 INFO vdl:mains Errors detected. Cleanup not done.<br>
> > > > > > > 2007-09-09 09:31:39,963 DEBUG VDL2ExecutionContext Execution completed with errors<br>
> > > > > > > Execution completed with errors<br>
> > > > > > ><br>
> > > > > > > There is nothing under<br>
> > > > > > > <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/dscratch/osg/data/osg/jtie/simple-wf-fc06kzz28d880">osg.hpcc.nd.edu/dscratch/osg/data/osg/jtie/simple-wf-fc06kzz28d880</a><br>
> > > > > > > except empty shared directory.
<br>
> > > > > > ><br>
> > > > > > > Thanks,<br>
> > > > > > > Jing<br>
> > > > > > ><br>
> > > > > > > On 9/7/07, Mihael Hategan <
<a moz-do-not-send="true" href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>>
wrote:<br>
> > > > > > > > Never mind that. It's eating the empty stdin argument. This doesn't make<br>
> > > > > > > > sense. Fork used to behave.<br>
> > > > > > > ><br>
> > > > > > > > Can you add this to
log4j.properties, run again, and post the log?<br>
> > > > > > > ><br>
> > > > > > > >
log4j.logger.org.globus.cog.abstraction=DEBUG<br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > > On Fri, 2007-09-07 at 23:02
-0500, Mihael Hategan wrote:
<br>
> > > > > > > > > Can you post the workflow?<br>
> > > > > > > > ><br>
> > > > > > > > > On Fri, 2007-09-07 at
21:56 -0500, Jing Tie wrote:
<br>
> > > > > > > > > > Thanks!<br>
> > > > > > > > > ><br>
> > > > > > > > > > I checked the
procedure, and found that the filenames of the output
<br>
> > > > > > > > > > files of "myapp" are
the same as "File f". So the first option should<br>
> > > > > > > > > > not be the problem.<br>
> > > > > > > > > >
<br>
> > > > > > > > > > Then I tried a very
simple swift script that doesn't need R:<br>
> > > > > > > > > > app function: add
some lines to the input file to generate the output file
<br>
> > > > > > > > > > input file name:
simpleFile.txt<br>
> > > > > > > > > > output file name:
simpleFile.output<br>
> > > > > > > > > > application script
location: $OSG_APP/osg/jtie/duplicate.sh
<br>
> > > > > > > > > > jobmanager:
jobmanager-fork<br>
> > > > > > > > > ><br>
> > > > > > > > > > duplicate failed<br>
> > > > > > > > > > The following errors
have occurred:
<br>
> > > > > > > > > > 1. Application
"duplicate" failed (Failed to link input file<br>
> > > > > > > > > >
/dscratch/osg/app/osg/jtie/duplicate.sh)<br>
> > > > > > > > > > Arguments:
"simpleFile.txt"<br>
> > > > > > > > > > Host:
NWICG_NotreDame<br>
> > > > > > > > > > Directory:
simple-wf-xgh9e9q1z0af2/duplicate-it7hyvgi
<br>
> > > > > > > > > > STDERR:<br>
> > > > > > > > > > STDOUT:<br>
> > > > > > > > > ><br>
> > > > > > > > > > After execution, 3 empty directories (not files, and with odd names)<br>
> > > > > > > > > > were generated:<br>
> > > > > > > > > >
duplicate-gt7hyvgi-simpleFile.output<br>
> > > > > > > > > >
duplicate-ht7hyvgi-simpleFile.output
<br>
> > > > > > > > > >
duplicate-it7hyvgi-simpleFile.output<br>
> > > > > > > > > ><br>
> > > > > > > > > > wrapper.log:<br>
> > > > > > > > > >
DIR=duplicate-gt7hyvgi
<br>
> > > > > > > > > >
STDOUT=simpleFile.output<br>
> > > > > > > > > > STDERR=stderr.txt<br>
> > > > > > > > > >
DIRS=simpleFile.output
<br>
> > > > > > > > > >
LINKS=/dscratch/osg/app/osg/jtie/duplicate.sh<br>
> > > > > > > > > > OUTS=simpleFile.txt<br>
> > > > > > > > > > ln: creating symbolic
link
<br>
> > > > > > > > > >
`duplicate-gt7hyvgi//dscratch/osg/app/osg/jtie/duplicate.sh' to<br>
> > > > > > > > > >
`/dscratch/osg/data/osg/jtie/simple-wf-xgh9e9q1z0af2/shared//dscratch/osg/app/osg/jtie/duplicate.sh':
<br>
> > > > > > > > > > No such file or
directory<br>
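<br>
A minimal Python sketch of why that ln call fails, assuming the wrapper builds the link name by joining the job directory and the LINKS entry with a plain "/" (the helper names joined_link_name and try_link and the temporary directory layout are hypothetical, not taken from wrapper.sh): because the entry is an absolute path, the joined name points into nested directories that were never created.<br>
<pre>
```python
import os
import tempfile


def joined_link_name(job_dir, entry):
    """Join with a plain "/" (assumed to mirror how the log paths were built)."""
    return job_dir + "/" + entry


def try_link(entry):
    """Try to symlink job_dir/entry to shared/entry inside a scratch directory."""
    with tempfile.TemporaryDirectory() as workflow_dir:
        shared = os.path.join(workflow_dir, "shared")
        job_dir = os.path.join(workflow_dir, "duplicate-gt7hyvgi")
        os.makedirs(shared)
        os.makedirs(job_dir)
        try:
            os.symlink(joined_link_name(shared, entry),
                       joined_link_name(job_dir, entry))
        except FileNotFoundError:
            # The parent directories of the link name do not exist.
            return "No such file or directory"
        return "linked"


if __name__ == "__main__":
    print(joined_link_name("duplicate-gt7hyvgi",
                           "/dscratch/osg/app/osg/jtie/duplicate.sh"))
    print(try_link("/dscratch/osg/app/osg/jtie/duplicate.sh"))
    print(try_link("duplicate.sh"))
```
</pre>
In this sketch, an entry given relative to the shared directory (duplicate.sh) links without error, while the absolute path reproduces the double-slash name from the log and the failure. It assumes a POSIX file system.<br>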
> > > > > > > > > ><br>
> > > > > > > > > > under dir
/dscratch/osg/data/osg/jtie/simple-wf-xgh9e9q1z0af2/shared:
<br>
> > > > > > > > > > seq.sh,
simpleFile.txt, wrapper.sh<br>
> > > > > > > > > > under dir
/dscratch/osg/data/osg/jtie/simple-wf-xgh9e9q1z0af2/duplicate-it7hyvgi:
<br>
> > > > > > > > > > one directory -
simpleFile.output<br>
> > > > > > > > > >
/dscratch/osg/data/osg/jtie/simple-wf-xgh9e9q1z0af2/duplicate-it7hyvgi/simpleFile.output:
<br>
> > > > > > > > > > empty<br>
> > > > > > > > > ><br>
> > > > > > > > > > I think the Swift program is right, since I ran it successfully<br>
> > > > > > > > > > on localhost with duplicate.sh.<br>
> > > > > > > > > ><br>
> > > > > > > > > ><br>
> > > > > > > > > > Many thanks,
<br>
> > > > > > > > > > Jing<br>
> > > > > > > > > ><br>
> > > > > > > > > ><br>
> > > > > > > > > > On 9/7/07, Mihael
Hategan <
<a moz-do-not-send="true" href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>>
wrote:<br>
> > > > > > > > > > > So when the
output files are not created, there can be two reasons:<br>
> > > > > > > > > > > 1. The
specification of what files should be created is broken. This is,
<br>
> > > > > > > > > > > at this time,
done by looking at the filenames of the return values from<br>
> > > > > > > > > > > the atomic
procedure. Normally one passes those file names to the
<br>
> > > > > > > > > > > application as
output file parameters. Example:<br>
> > > > > > > > > > ><br>
> > > > > > > > > > > (File f)
proc(...) {
<br>
> > > > > > > > > > > app {<br>
> > > > > > > > > > > myapp ...
"-o" @filename(f);<br>
> > > > > > > > > > > }
<br>
> > > > > > > > > > > }<br>
> > > > > > > > > > ><br>
> > > > > > > > > > > 2. The
specification is correct, but the application doesn't behave.
<br>
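<br>
The check behind the first reason can be sketched in Python as follows, assuming the expected outputs are given as file names relative to the job directory (missing_outputs and the sample names are hypothetical, not Swift's own implementation): a name counts as created only if it exists as a regular file.<br>
<pre>
```python
import os
import tempfile


def missing_outputs(job_dir, expected):
    """Return the expected names that are not regular files in job_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(job_dir, name))]


def demo():
    with tempfile.TemporaryDirectory() as job_dir:
        # The application wrote one output but not the other.
        with open(os.path.join(job_dir, "a.out"), "w") as handle:
            handle.write("done\n")
        # A directory with the expected name does not count as an output file.
        os.mkdir(os.path.join(job_dir, "c.out"))
        return missing_outputs(job_dir, ["a.out", "b.out", "c.out"])


if __name__ == "__main__":
    print(demo())
```
</pre>
An empty result means every declared output exists; a directory that merely carries the expected name is still reported as missing.<br>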
> > > > > > > > > > ><br>
> > > > > > > > > > > Mihael<br>
> > > > > > > > > > ><br>
> > > > > > > > > > > On Fri,
2007-09-07 at 12:35 -0500, Jing Tie wrote:
<br>
> > > > > > > > > > > > Hi,<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > > I tried jobmanager-fork instead of jobmanager-condor on the <a
moz-do-not-send="true" href="http://osg.hpcc.nd.edu">osg.hpcc.nd.edu</a>
site:<br>
> > > > > > > > > > > > <recall that the site used to have the exception - kickstart executable<br>
> > > > > > > > > > > > (101-FBchannel18_cwt-avgResults.Rdata) not found><br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
jobmanager-fork:<br>
> > > > > > > > > > > >
------------------------
<br>
> > > > > > > > > > > > Application
exception: The following output files were not created by<br>
> > > > > > > > > > > > the
application: /dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh
<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
globus-job-run <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/jobmanager">osg.hpcc.nd.edu/jobmanager</a>
/bin/ls -al
<br>
> > > > > > > > > > > >
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/cwtsmall-0bhnnvgi<br>
> > > > > > > > > > > > lrwxrwxrwx 1 osg osgusers 93 Sep 7 12:42 101-FBchannel10_cwt-avgResults.Rdata -><br>
> > > > > > > > > > > > /dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared/101-FBchannel10_cwt-avgResults.Rdata<br>
> > > > > > > > > > > > ... (total 28 links, the same number as the number of the expected output files)<br>
> > > > > > > > > > > > drwxr-xr-x 2 osg osgusers 4096 Sep 7 12:42 101_FB-epochs.Rdata<br>
> > > > > > > > > > > > drwxr-xr-x 3 osg osgusers 4096 Sep 7 12:42 scripts<br>
> > > > > > > > > > > > -rw-r--r-- 1 osg osgusers 58 Sep 7 12:42 stderr.txt<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
globus-job-run <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/jobmanager">osg.hpcc.nd.edu/jobmanager
</a> /bin/ls -al<br>
> > > > > > > > > > > >
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared<br>
> > > > > > > > > > > > -rw-r--r-- 1 osg osgusers 4109752 Sep 7 12:41 101_FB-epochs.Rdata<br>
> > > > > > > > > > > >
drwxr-xr-x 2 osg osgusers 4096 Sep 7 12:41 scripts<br>
> > > > > > > > > > > >
-rw-r--r-- 1 osg osgusers 571 Sep 7 12:41 seq.sh<br>
> > > > > > > > > > > >
-rw-r--r-- 1 osg osgusers 3278 Sep 7 12:41 wrapper.sh<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > > empty
kickstart directory
<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
globus-job-run <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/jobmanager">osg.hpcc.nd.edu/jobmanager</a>
/bin/cat
<br>
> > > > > > > > > > > >
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/status/cwtsmall-0bhnnvgi-error<br>
> > > > > > > > > > > > The
following output files were not created by the application:
<br>
> > > > > > > > > > > >
/dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
<br>
> > > > > > > > > > > >
jobmanager-condor:<br>
> > > > > > > > > > > >
-------------------------------<br>
> > > > > > > > > > > > Application
exception: The following output files were not created by
<br>
> > > > > > > > > > > > the
application: 101-FBchannel20_cwt-avgResults.Rdata<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
globus-job-run <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/jobmanager">osg.hpcc.nd.edu/jobmanager</a>
/bin/ls -al<br>
> > > > > > > > > > > >
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/cwtsmall-cpteovgi<br>
> > > > > > > > > > > > lrwxrwxrwx 1 osg osgusers 76 Sep 7 13:01 101_FB-epochs.Rdata -><br>
> > > > > > > > > > > > /dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared/101_FB-epochs.Rdata<br>
> > > > > > > > > > > >
drwxr-xr-x 3 osg osgusers 4096 Sep 7 13:01 scripts<br>
> > > > > > > > > > > >
-rw-r--r-- 1 osg osgusers 70 Sep 7 13:01 stderr.txt<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > >
globus-job-run <a moz-do-not-send="true"
href="http://osg.hpcc.nd.edu/jobmanager">osg.hpcc.nd.edu/jobmanager
</a> /bin/ls -al<br>
> > > > > > > > > > > >
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared<br>
> > > > > > > > > > > > -rw-r--r-- 1 osg osgusers 4109752 Sep 7 13:00 101_FB-epochs.Rdata<br>
> > > > > > > > > > > >
drwxr-xr-x 2 osg osgusers 4096 Sep 7 13:00 scripts<br>
> > > > > > > > > > > >
-rw-r--r-- 1 osg osgusers 571 Sep 7 13:00 seq.sh<br>
> > > > > > > > > > > >
-rw-r--r-- 1 osg osgusers 3278 Sep 7 13:00 wrapper.sh<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > > I think the exception descriptions are all correct now. The<br>
> > > > > > > > > > > > difference between fork and condor is that fork created the output<br>
> > > > > > > > > > > > links to the shared directory, but condor didn't. The essential<br>
> > > > > > > > > > > > problem, though, is that the output files are not being created. I will do more<br>
> > > > > > > > > > > > experiments to see whether the problem is in the file system or the application.<br>
> > > > > > > > > > > ><br>
> > > > > > > > > > > > Thanks,<br>
> > > > > > > > > > > > Jing<br>
> > > > > > > > > > > >
_______________________________________________
<br>
> > > > > > > > > > > > Swift-user
mailing list<br>
> > > > > > > > > > > > <a
moz-do-not-send="true" href="mailto:Swift-user@ci.uchicago.edu">Swift-user@ci.uchicago.edu
</a><br>
> > > > > > > > > > > > <a
moz-do-not-send="true"
href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-user">http://mail.ci.uchicago.edu/mailman/listinfo/swift-user</a><br>
> > > > > > > > > > > >
<br>
> > > > > > > > > > ><br>
> > > > > > > > > > ><br>
> > > > > > > > > ><br>
> > > > > > > > ><br>
> > > > > > > > >
<br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > ><br>
> > > > > ><br>
> > > > > ><br>
> > > ><br>
> > > ><br>
> > ><br>
> > ><br>
> <br>
> <br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Ian Foster, Director, Computation Institute
Argonne National Laboratory & University of Chicago
Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
Tel: +1 630 252 4619. Web: <a class="moz-txt-link-abbreviated" href="http://www.ci.uchicago.edu">www.ci.uchicago.edu</a>.
Globus Alliance: <a class="moz-txt-link-abbreviated" href="http://www.globus.org">www.globus.org</a>.
</pre>
</body>
</html>