[Swift-devel] [CI Ticketing System #1074] How to set .soft and env to run condor on TeraPort?

Ti Leggett support at ci.uchicago.edu
Fri Jun 19 08:49:17 CDT 2009


There were some misconfigurations in the @globus-4 macro for rhel-5 and condor
that I've just fixed. Can you set your ~/.soft to contain the following and
then run resoft:

@globus-4

@default

You should be using /soft/condor-7.0.5-r1 and /soft/globus-4.2.1-r2 after that.
Let me know if that works for you, or if anything changes.
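
A quick way to confirm which installs your shell picks up after resoft is to
check where the client commands resolve. The following is only a minimal
sketch: check_prefix is a hypothetical helper, not part of softenv, and it
assumes the client binaries live somewhere under the two prefixes named above.

```shell
# Hypothetical helper: report whether a command found on PATH lives under
# an expected install prefix.
# Usage: check_prefix <command> <expected-prefix>
check_prefix() {
    # Resolve the command through PATH; fail if it is not found at all.
    tool_path=$(command -v "$1" 2>/dev/null) || {
        echo "$1: not found in PATH"
        return 1
    }
    # Compare the resolved path against the expected prefix.
    case "$tool_path" in
        "$2"/*)
            echo "$1: OK ($tool_path)"
            ;;
        *)
            echo "$1: unexpected location ($tool_path)"
            return 1
            ;;
    esac
}
```

For example, "check_prefix condor_submit /soft/condor-7.0.5-r1" and
"check_prefix globus-url-copy /soft/globus-4.2.1-r2" should both report OK if
the macro change took effect; an "unexpected location" result suggests another
.soft entry or setup script is still ahead in PATH.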

On Thu Jun 18 16:44:15 2009, wilde at mcs.anl.gov wrote:
> Hi,
>
> Swift users need to run the condor-g client in order to send jobs to OSG
> sites from a Swift script.
>
> Can you tell us how to set .soft and env so that condor_submit to the
> "grid" universe works?
>
> We've had all sorts of problems in getting this to work well:
>
> - the version of condor client code on communicado is too new to run
> with Swift.
>
> - On teraport, it seems difficult to get the right settings of .soft
> entries and setup.sh scripts to work correctly together
>
> - I still don't know if what worked for Zhao on tp-osg a month ago still
> works. It seems not to, and I can't tell if it's because of a change in
> .soft or env settings, or some other software issue
>
> - We would like to run from Teraport compute nodes with qsub -I, and
> hope that whatever we determine to be the right settings for login nodes
> work on interactive compute nodes as well.
>
> - It would be good *not* to run on tp-osg.
>
> Suchandra, Ti, or Greg, can you help us sort out how to set things
> correctly?
>
> Thanks,
>
> Mike
>
>
> -------- Original Message --------
> Subject: Re: [Swift-devel] Cant run condor-g on TeraPort
> Date: Thu, 18 Jun 2009 19:31:26 +0000 (GMT)
> From: Ben Clifford <benc at hawaga.org.uk>
> To: Michael Wilde <wilde at mcs.anl.gov>
> CC: swift-devel <swift-devel at ci.uchicago.edu>
> References: <4A3A93E2.2080805 at mcs.anl.gov>
>
>
> condor_q works for me on tp-osg if I source /opt/osg/setenv.sh rather than
> use softenv. It doesn't work for me if I use @osg in softenv, with the
> error you report.
>
> On Thu, 18 Jun 2009, Michael Wilde wrote:
>
> > As far as I can tell, the condor client code is broken on TeraPort.
> >
> > I've tried this on tp-login and tp-osg; I am using +osg-client and @osg in my
> > .soft. I source $VDT_LOCATION/setup.sh
> >
> > Zhao, Glen, can you cross-check and see if you are now seeing the same thing?
> >
> > My suspicion is that the condor client config broke in the last month, through
> > OSG changes, CI Support work, etc.
> >
> > - Mike
> >
> >
> > I get this from condor_q:
> >
> > tp$ condor_q
> > Error:
> >
> > Extra Info: You probably saw this error because the condor_schedd is not
> > running on the machine you are trying to query. If the condor_schedd is not
> > running, the Condor system will not be able to find an address and port to
> > connect to and satisfy this request. Please make sure the Condor daemons are
> > running and try again.
> >
> > Extra Info: If the condor_schedd is running on the machine you are trying to
> > query and you still see the error, the most likely cause is that you have
> > setup a personal Condor, you have not defined SCHEDD_NAME in your
> > condor_config file, and something is wrong with your SCHEDD_ADDRESS_FILE
> > setting. You must define either or both of those settings in your config
> > file, or you must use the -name option to condor_q. Please see the Condor
> > manual for details on SCHEDD_NAME and SCHEDD_ADDRESS_FILE.
> > tp$
> >
> > and this from swift:
> >
> > tp-grid1$ swift -tc.file tc.data -sites.file sites.condorg.xml cat.swift
> > Swift svn swift-r2890 cog-r2392
> >
> > RunID: 20090618-1404-mo0thjj4
> > Progress:
> > Progress: Stage in:1
> > Progress: Submitted:1
> > Failed to transfer wrapper log from cat-20090618-1404-mo0thjj4/info/h on firefly
> > Progress: Failed:1
> > Execution failed:
> > Exception in cat:
> > Arguments: [data.txt]
> > Host: firefly
> > Directory: cat-20090618-1404-mo0thjj4/jobs/h/cat-hv5s3gcj
> > stderr.txt:
> >
> > stdout.txt:
> >
> > ----
> >
> > Caused by:
> > Cannot submit job: Could not submit job (condor_submit reported an
> > exit code of 1). no error output
> > tp-grid1$ ls
> >
> > --
> >
> > Using this sites file:
> >
> > <config>
> > <pool handle="firefly" >
> > <gridftp url="gsiftp://ff-grid.unl.edu" />
> > <execution provider="condor" />
> > <profile namespace="globus" key="jobType">grid</profile>
> > <profile namespace="globus" key="gridResource">gt2 ff-grid.unl.edu/jobmanager-pbs</profile>
> > <workdirectory>/panfs/panasas/CMS/data/oops/wilde/swiftwork</workdirectory>
> > </pool>
> > </config>
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >
> >



