[Swift-user] Using swift with falkon on teraport
Michael Wilde
wilde at mcs.anl.gov
Tue Mar 25 11:18:42 CDT 2008
Hi Quan,
I'm doing something similar at the moment on machines at Argonne.
Do you already have Falkon built? (I'm using the attached file of notes
that I compiled from Ioan).
I run swift and falkon together on a host that has access to the cluster
shared filesystem, which in your case would be tp-login (or better yet a
cluster node that you can allocate using qsub -I, so as not to over-tax
a login host).
I use the local data provider, so that swift uses direct
shared-filesystem access to move data back and forth and do directory
and status file management.
Here's my sites file:
Below is my working doc of info from Ioan and Zhao, also attached in Word.
- Mike
<pool handle="sico">
<gridftp url="local://localhost"/>
<execution provider="deef"
url="http://140.221.37.30:50001/wsrf/services/GenericPortal/core/WS/GPFactoryService"/>
<workdirectory>/home/wilde/swiftwork</workdirectory>
</pool>
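If the Falkon service host or port changes between runs, the pool entry
above can be regenerated from a few variables. This is only a minimal
sketch: the function name make_pool_entry and its argument order are made
up for illustration and are not part of Swift or Falkon. The element
names, the "deef" provider and the service path are copied from the entry
above.

```shell
# Hypothetical helper (not part of Swift or Falkon): print a <pool> entry
# like the one above for a given handle, service host, port and work dir.
make_pool_entry() {
    handle=$1; host=$2; port=$3; workdir=$4
    cat <<EOF
<pool handle="$handle">
<gridftp url="local://localhost"/>
<execution provider="deef"
url="http://$host:$port/wsrf/services/GenericPortal/core/WS/GPFactoryService"/>
<workdirectory>$workdir</workdirectory>
</pool>
EOF
}

# Example: reproduce the entry shown above.
make_pool_entry sico 140.221.37.30 50001 /home/wilde/swiftwork
```

The output still has to be placed inside whatever sites file Swift is
pointed at; the helper only prints the one entry.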
Compiling Swift with Falkon support:
when you build Swift, add the -Dwith-provider-deef option:
cd ${FALKON_ROOT}/cog/modules/vdsk/
ant -Dwith-provider-deef redist
Security Note
The BGexec workers have no security of their own. They connect back to
the Falkon service and get work from there; they don't have any server
sockets, so someone would have to hijack the connections and fake the
service in order to inject jobs into the workers. If the workers had
server sockets listening on some ports it would be different, but they
are simple clients that only make outgoing connections to one specific
IP, the service IP. The Falkon service can run on the same box as Swift,
behind a firewall, with only 3 ports open.
Java Needs
IA64 nodes require Java 1.4 or later;
versions up to 1.6 work.
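A quick way to check a node before starting anything on it might look
like the sketch below. It reads the note above as a supported range of
Java 1.4 through 1.6, which is my interpretation and not something the
note states exactly. The function name java_version_ok is made up for
illustration.

```shell
# Hypothetical helper: succeed if a Java version string such as "1.5.0_12"
# falls in the 1.4 to 1.6 range (an assumed reading of the note above).
java_version_ok() {
    # Extract the minor number from a "1.<minor>..." version string.
    minor=$(printf '%s\n' "$1" | sed -n 's/^1\.\([0-9][0-9]*\).*/\1/p')
    [ -n "$minor" ] && [ "$minor" -ge 4 ] && [ "$minor" -le 6 ]
}

# Example usage ("java -version" writes to stderr):
# ver=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
# java_version_ok "$ver" || echo "unsupported Java version: $ver" >&2
```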
Falkon Tarball
wget http://people.cs.uchicago.edu/~iraicu/source/falkon-r83.tgz
tar xfz falkon-r83.tgz
cd falkon-r83/
source falkon.env
if you want to re-build (not needed for this tarball)
falkon-clean.sh
falkon-build.sh
Building Falkon
The SVN archive has grown rather large recently, and some of the
directories (e.g. workloads and AstroPortal) make up the largest part of
the contents. With its current organization, here is how you would do a
minimal checkout (~43MB, Falkon User Guide, Section 2.1,
http://dev.globus.org/images/0/0e/Falkon_User_Guide_v2.pdf), and compile:
export ANT_HOME=/home/wilde/ant/dist
svn co https://svn.globus.org/repos/falkon -N
cd falkon
svn co https://svn.globus.org/repos/falkon/bin
source falkon.env
falkon-checkout-minimal.sh
source falkon.env
falkon-build.sh
This checkout takes 62 seconds for me, and the compile takes 43 seconds.
BTW, the entire thing (including all .svn dirs and compiled) is 148MB
after a clean checkout and compilation.
Starting Falkon
On screen 1:
cd falkon-r83
source falkon.env
falkon-service-stdout.sh 50001 config/Falkon-TCPCore.config
On screen 2:
cd falkon-r83
source falkon.env
falkon-worker-stdout.sh localhost 50001
at this point, you have the service running... press any key and enter
at the worker to terminate
BGexecs on SiCo:
The file: /home/iraicu/java/svn/falkon/worker/ServiceName.txt
points each BGexec to where the service is running
so you need to update that file prior to starting the BGexecs with the
IP of the service
then to start them:
cd ~iraicu/java/svn/falkon/worker
./run.drp-slurm.sh 6 60
this would start 6 BGexecs for 60 minutes
you might need to copy over the BGexec source (1 file) and the starting
scripts (2 of them), and compile the source on SiCo itself
Testing:
create a 3rd screen
cd falkon-r83
source falkon.env
falkon-client.sh 140.221.37.30 50001 workloads/sleep/sleep_1x10
the IP can also be localhost at this point
Debugging and Logs
here are the logs you need to make sure you capture when running in
debug mode:
cd ~/java/svn/falkon/config
cat Falkon-TCPCore.config
GenericPortalWS=falkon_task_submission_history.txt
GenericPortalWS_perf_per_sec=falkon_summary.txt
GenericPortalWS_taskPerf=falkon_task_perf.txt
GenericPortalWS_task=falkon_task_status.txt
When running in normal mode (when we know things work fine), we just need:
cd ~/java/svn/falkon/config
cat Falkon-TCPCore.config
GenericPortalWS_perf_per_sec=falkon_summary.txt
GenericPortalWS_taskPerf=falkon_task_perf.txt
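To see at a glance which log files a given config will produce, something
like the following sketch could be used. It assumes only that the log
settings are plain key=value lines whose keys start with GenericPortalWS,
as in the listings above; the function name falkon_log_files is made up
for illustration and is not a Falkon script.

```shell
# Hypothetical helper: list the log files named by GenericPortalWS* keys
# in a Falkon config file (key=value lines, as in the listings above).
falkon_log_files() {
    sed -n 's/^GenericPortalWS[A-Za-z_]*=\(.*\)$/\1/p' "$1"
}

# Example usage:
# falkon_log_files config/Falkon-TCPCore.config
```

Four file names in the output would correspond to the debug-mode settings
and two to the normal-mode settings shown above.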
In the event that we can't figure out things from the Swift and Falkon
service logs, we might have to enable worker side logs as well, which
you do from the run.worker-c.sh (or run.worker-c-ram.sh) script(s).
On 3/25/08 11:01 AM, Quan Tran Pham wrote:
> Hi,
>
> I just wonder if anyone has run Swift with Falkon on teraport? How do
> you config Swift (sites.xml (no sample on falkon), tc.data (no need to
> change?)). I find a link about Swift and Falkon here
> http://dev.globus.org/wiki/Incubator/Falkon#Project_Branches , but the
> link to the article has no content.
>
> I am trying to: run falkon on tp-login, run swift on that same
> machine to submit jobs to falkon to run on teraport.
>
> Thank you very much
>
> Quan Pham
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Falkon.SiCo.FromIoan.2008.0311.doc
Type: application/msword
Size: 37376 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20080325/79a9a9a6/attachment.doc>