[Swift-user] follow up of kickstart executable not found problem
Jing Tie
tiejing at gmail.com
Fri Sep 7 12:35:03 CDT 2007
Hi,
I tried jobmanager-fork instead of jobmanager-condor on the osg.hpcc.nd.edu site:
<recall that the site used to raise the exception: kickstart executable
(101-FBchannel18_cwt-avgResults.Rdata) not found>
jobmanager-fork:
------------------------
Application exception: The following output files were not created by
the application: /dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh
globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/cwtsmall-0bhnnvgi
lrwxrwxrwx 1 osg osgusers 93 Sep 7 12:42
101-FBchannel10_cwt-avgResults.Rdata ->
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared/101-FBchannel10_cwt-avgResults.Rdata
... (total 28 links, the same number as the number of the expected output files)
drwxr-xr-x 2 osg osgusers 4096 Sep 7 12:42 101_FB-epochs.Rdata
drwxr-xr-x 3 osg osgusers 4096 Sep 7 12:42 scripts
-rw-r--r-- 1 osg osgusers 58 Sep 7 12:42 stderr.txt
globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared
-rw-r--r-- 1 osg osgusers 4109752 Sep 7 12:41 101_FB-epochs.Rdata
drwxr-xr-x 2 osg osgusers 4096 Sep 7 12:41 scripts
-rw-r--r-- 1 osg osgusers 571 Sep 7 12:41 seq.sh
-rw-r--r-- 1 osg osgusers 3278 Sep 7 12:41 wrapper.sh
The kickstart directory is empty.
globus-job-run osg.hpcc.nd.edu/jobmanager /bin/cat
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/status/cwtsmall-0bhnnvgi-error
The following output files were not created by the application:
/dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh
jobmanager-condor:
-------------------------------
Application exception: The following output files were not created by
the application: 101-FBchannel20_cwt-avgResults.Rdata
globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/cwtsmall-cpteovgi
lrwxrwxrwx 1 osg osgusers 76 Sep 7 13:01 101_FB-epochs.Rdata ->
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared/101_FB-epochs.Rdata
drwxr-xr-x 3 osg osgusers 4096 Sep 7 13:01 scripts
-rw-r--r-- 1 osg osgusers 70 Sep 7 13:01 stderr.txt
globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared
-rw-r--r-- 1 osg osgusers 4109752 Sep 7 13:00 101_FB-epochs.Rdata
drwxr-xr-x 2 osg osgusers 4096 Sep 7 13:00 scripts
-rw-r--r-- 1 osg osgusers 571 Sep 7 13:00 seq.sh
-rw-r--r-- 1 osg osgusers 3278 Sep 7 13:00 wrapper.sh
I think the exception descriptions are correct now. The difference
between fork and condor is that fork created the output links to the
shared directory, but condor didn't. The essential problem, though, is
that the output files are not being created. I will run more
experiments to see whether the problem lies with the file system or
the application.
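
One way to check listings like the ones above for links whose targets
are missing from the shared directory is sketched below. This is only
a minimal sketch: the helper names parse_listing and dangling_links are
hypothetical (they are not part of Swift or Globus), and it assumes the
wrapped lines of the `ls -al` output have been joined back into one
line per entry and that file names contain no spaces.

```python
import posixpath
import re

# Permission field of an `ls -al` entry, e.g. "lrwxrwxrwx" or "-rw-r--r--".
PERMS_RE = re.compile(r"[-dl][-rwxsStT]{9}")


def parse_listing(listing):
    """Split `ls -al` text into (plain_names, links).

    plain_names: set of names of regular files and directories.
    links: dict mapping symlink name -> target path.
    Lines that do not look like listing entries are ignored.
    """
    plain_names, links = set(), {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9 or not PERMS_RE.fullmatch(fields[0]):
            continue  # skip "total N", blank lines, and commentary
        kind = fields[0][0]
        if kind == "l":
            if len(fields) >= 11 and fields[-2] == "->":
                links[fields[-3]] = fields[-1]
        else:
            plain_names.add(fields[-1])
    return plain_names, links


def dangling_links(job_listing, shared_listing, shared_dir):
    """Return names of links in the job directory that point into
    shared_dir at a file the shared listing does not contain."""
    shared_names, _ = parse_listing(shared_listing)
    _, links = parse_listing(job_listing)
    shared_dir = shared_dir.rstrip("/")
    missing = []
    for name, target in sorted(links.items()):
        in_shared = posixpath.dirname(target) == shared_dir
        if in_shared and posixpath.basename(target) not in shared_names:
            missing.append(name)
    return missing
```

Fed the fork listings above (job directory and shared directory), this
would flag a link such as 101-FBchannel10_cwt-avgResults.Rdata, since
no file of that name appears in the shared listing as shown. For the
condor listings it would flag nothing, because the only link there
points at 101_FB-epochs.Rdata, which is present.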
Thanks,
Jing