[Swift-user] follow up of kickstart executable not found problem

Jing Tie tiejing at gmail.com
Fri Sep 7 12:35:03 CDT 2007


Hi,

I tried jobmanager-fork instead of jobmanager-condor on the osg.hpcc.nd.edu site.
(Recall that this site previously gave the exception "kickstart executable
(101-FBchannel18_cwt-avgResults.Rdata) not found".)

jobmanager-fork:
------------------------
Application exception: The following output files were not created by
the application: /dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh

globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/cwtsmall-0bhnnvgi
lrwxrwxrwx  1 osg osgusers   93 Sep  7 12:42
101-FBchannel10_cwt-avgResults.Rdata ->
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared/101-FBchannel10_cwt-avgResults.Rdata
... (total 28 links, the same number as the number of the expected output files)
drwxr-xr-x  2 osg osgusers 4096 Sep  7 12:42 101_FB-epochs.Rdata
drwxr-xr-x  3 osg osgusers 4096 Sep  7 12:42 scripts
-rw-r--r--  1 osg osgusers   58 Sep  7 12:42 stderr.txt

globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/shared
-rw-r--r--  1 osg osgusers 4109752 Sep  7 12:41 101_FB-epochs.Rdata
drwxr-xr-x  2 osg osgusers    4096 Sep  7 12:41 scripts
-rw-r--r--  1 osg osgusers     571 Sep  7 12:41 seq.sh
-rw-r--r--  1 osg osgusers    3278 Sep  7 12:41 wrapper.sh

The kickstart directory is empty.

globus-job-run osg.hpcc.nd.edu/jobmanager /bin/cat
/dscratch/osg/data/osg/jtie/sid-wf1-2g48a936yixu1/status/cwtsmall-0bhnnvgi-error
The following output files were not created by the application:
/dscratch/osg/app/osg/jtie/SIDGrid/wavelet.sh


jobmanager-condor:
-------------------------------
Application exception: The following output files were not created by
the application: 101-FBchannel20_cwt-avgResults.Rdata

globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/cwtsmall-cpteovgi
lrwxrwxrwx  1 osg osgusers   76 Sep  7 13:01 101_FB-epochs.Rdata ->
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared/101_FB-epochs.Rdata
drwxr-xr-x  3 osg osgusers 4096 Sep  7 13:01 scripts
-rw-r--r--  1 osg osgusers   70 Sep  7 13:01 stderr.txt

globus-job-run osg.hpcc.nd.edu/jobmanager /bin/ls -al
/dscratch/osg/data/osg/jtie/sid-wf1-3my5pn01t3ov0/shared
-rw-r--r--  1 osg osgusers 4109752 Sep  7 13:00 101_FB-epochs.Rdata
drwxr-xr-x  2 osg osgusers    4096 Sep  7 13:00 scripts
-rw-r--r--  1 osg osgusers     571 Sep  7 13:00 seq.sh
-rw-r--r--  1 osg osgusers    3278 Sep  7 13:00 wrapper.sh

I think the exception descriptions are correct now. The difference
between fork and condor is that fork created the links for the output
files, pointing into the shared directory, but condor did not. The
essential problem, though, is that the output files are not being
created. I will run more experiments to see whether the problem lies
in the file system or in the application.
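
One way to check this directly on the site is to look for links in a
job directory whose targets are missing. In the fork listing above, the
links point into shared/, but the shared/ listing does not show those
files. Below is a minimal sketch, assuming a POSIX shell and that
readlink is available on the node. The function name dangling_links is
hypothetical and is not part of Swift or Globus; it would have to run on
the site itself.

```shell
# Hypothetical helper (not part of Swift or Globus): list the symlinks
# directly inside a directory whose targets do not exist.
dangling_links() {
    dir=$1
    for f in "$dir"/*; do
        # -L is true for a symlink; -e follows the link, so "! -e"
        # is true when the link's target is missing.
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            printf '%s -> %s\n' "$f" "$(readlink "$f")"
        fi
    done
}
```

Called as "dangling_links <job directory>", it prints one
"link -> target" line per broken link and nothing if every link
resolves.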

Thanks,
Jing


