[Swift-devel] Re: Analysis of wrapper.sh
Michael Wilde
wilde at mcs.anl.gov
Tue Jul 29 12:27:08 CDT 2008
Thanks, Zhao.
That's a good start. Where I want you to take this (with help from me and
others on the team) is to create a detailed description of how data
flows in Swift, for use by both end users and developers.
What you show here so far is mainly the wrapper code itself.
I'm looking for a diagram that shows the three main data locations and
explains the important stages in data management during a workflow:
what they mean and why they are done.
The three areas are: the data files' original location when the mapper
sees them; the shared dir on each site; the work dir on each compute node.
Examples of questions I'd like this to cover are:
why do we have a shared dir? (Answer: to re-use transferred or generated
files within a workflow without re-transferring).
what's the lifetime of this directory? what in it is persistent vs
removed after jobs and/or scripts complete?
when does output come back? Where to?
how are relative vs absolute pathnames handled?
how are URL-prefixed pathnames handled? (gsiftp://, http:// etc?)
which Swift properties affect data management?
Same for options in profiles?
how should wrappers be written that reference files installed as part of
the application?
what are various ways in which wrappers and apps can utilize worker node
disk, today?
which of the patches that Ben implemented for testing in March-April on
the BGP and Sicortex are integrated, and which remain patches to be
considered for testing and integration?
Some of these questions are more useful and make more sense than others,
but this is the general thing I want to get documented.
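
To make the shared-dir answer concrete, here is a minimal sketch of the re-use idea in shell. The stage_in function and its arguments are hypothetical, not taken from Swift, and a plain cp stands in for whatever transfer mechanism a site really uses.

```shell
#!/bin/bash
# Hypothetical helper (not Swift code): place a file in the site's shared
# dir only if it is not already there, so later jobs in the same workflow
# re-use the earlier transfer instead of repeating it.
stage_in() {
    local src="$1" shared="$2"
    local dest="$shared/$(basename "$src")"
    if [ -e "$dest" ]; then
        echo "reuse $dest"
    else
        # cp is a stand-in for the real transfer step
        mkdir -p "$shared" && cp "$src" "$dest" && echo "transfer $dest"
    fi
}
```

Calling stage_in twice with the same file should transfer it once and report a re-use the second time.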
- Mike
On 7/29/08 12:02 PM, Zhao Zhang wrote:
> Hi, All
>
> I made this analysis of wrapper.sh. Correct me if anything is
> wrong. Thanks.
>
> zhao
>
>
> SWIFT phase: When swift is started, it creates a directory with the
> workload name and a random string, something like
>
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc
> |
> |____info
> |
> |____kickstart
> |
> |____shared
> |    |
> |    |____wrapper.sh
> |    |
> |    |____seq.sh
> |
> |____status
>
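
For readers who want to reproduce this layout by hand, here is a rough shell sketch. It only mirrors the directory names quoted above; make_run_dir is a hypothetical name, and the real directory is created by Swift itself, not by a script like this. The random suffix comes from mktemp, so it will not look exactly like Swift's.

```shell
#!/bin/bash
# Hypothetical helper: build a run directory named
# <workload>-<date>-<time>-<random> with the subdirs listed in the analysis.
make_run_dir() {
    local base="$1" workload="$2"
    local stamp run
    stamp=$(date +%Y%m%d-%H%M)
    # mktemp replaces the trailing X's with random characters
    run=$(mktemp -d "$base/$workload-$stamp-XXXXXXXX") || return 1
    mkdir -p "$run/info" "$run/kickstart" "$run/shared" "$run/status" || return 1
    echo "$run"
}
```

Usage would be something like: run=$(make_run_dir /tmp sleep)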
>
> WRAPPER.SH phase
>
> In my test case, BGexec received a task in this format:
> "shared/wrapper.sh sleep-l5clzzvi -jobdir l -e /bin/sleep -out
> stdout.txt -err stderr.txt -i -d -if -of -k -a 600"
> with the working dir "/home/zzhang/swift/sleep-20080724-1527-qakbkkcc"
>
> WORKING_DIRECTORY OPERATION
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc OPEN wrapper.log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK if
> jobid ($1, sleep-l5clzzvi ) is empty ----> empty exit with 254
>     |
>     V
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc Get -jobdir
> as $JOBDIR
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK if
> -jobdir ( l ) is empty ----> empty exit with 254
>     |
>     V
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc mkdir -p
> $WFDIR/info/$JOBDIR (mkdir
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/info/l )
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc rm -f
> "$WFDIR/info/$JOBDIR/${ID}-info" ( make a clean $ID-info file
> ID=sleep-l5clzzvi )
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc openinfo
> "$WFDIR/info/$JOBDIR/${ID}-info"
>
> ( openinfo
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/info/l/sleep-l5clzzvi-info)
>
> creating log file
>
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "LOG_START" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT "Wrapper"
> into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc mkdir -p
> $WFDIR/status/$JOBDIR (create status parent dir for the job "mkdir -p
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/status/l ")
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PARSE the
> arguments
>
> ( EXEC=/bin/sleep, STDOUT=stdout.txt,
> STDERR=stderr.txt, STDIN=NULL,
>
> DIRS=null, INF=NULL, OUTF=NULL, KICKSTART=NULL)
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc Check if
> there are arguments after -a ----> empty exit with 254
>     |
>     V
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc change $@
> from "-a 600" to "600"
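
The argument handling described up to this point could be sketched roughly as follows. This is a simplified stand-in, not the code in wrapper.sh: parse_wrapper_args and APP_ARGS are names made up for illustration, and the rule that an option's value ends at the next word starting with "-" is an assumption inferred from the example task line.

```shell
#!/bin/bash
# Simplified stand-in for the wrapper's argument handling (not the real code).
# Each option collects the words that follow it, up to the next word that
# starts with "-"; everything after -a is kept as the application arguments.
parse_wrapper_args() {
    ID="$1"; shift
    JOBDIR="" EXEC="" STDOUT="" STDERR="" STDIN=""
    DIRS="" INF="" OUTF="" KICKSTART="" APP_ARGS=""
    while [ $# -gt 0 ]; do
        local opt="$1" val=""
        shift
        if [ "$opt" = "-a" ]; then
            APP_ARGS="$*"        # flattened to one string for simplicity
            break
        fi
        # gather value words until the next option
        while [ $# -gt 0 ] && [ "${1#-}" = "$1" ]; do
            val="${val:+$val }$1"
            shift
        done
        case "$opt" in
            -jobdir) JOBDIR="$val" ;;
            -e)      EXEC="$val" ;;
            -out)    STDOUT="$val" ;;
            -err)    STDERR="$val" ;;
            -i)      STDIN="$val" ;;
            -d)      DIRS="$val" ;;
            -if)     INF="$val" ;;
            -of)     OUTF="$val" ;;
            -k)      KICKSTART="$val" ;;
            *)       echo "unknown option: $opt" >&2; return 254 ;;
        esac
    done
    # mirror the checks in the analysis: job id, -jobdir and -a must be non-empty
    [ -n "$ID" ] && [ -n "$JOBDIR" ] && [ -n "$APP_ARGS" ] || return 254
}
```

With the task line from the test case, this leaves EXEC=/bin/sleep, STDOUT=stdout.txt, STDERR=stderr.txt, the -i, -d, -if, -of and -k values empty, and APP_ARGS=600. A value that itself begins with "-" would confuse this sketch.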
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc Check if
> "$SWIFT_JOBDIR_PATH" is NULL ----> NO, local copy
>     |
>     V
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc YES, shared
> file system. DIR=jobs/$JOBDIR/$ID ( DIR=jobs/l/sleep-l5clzzvi )
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc set PATH
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT all
> arguments into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "CREATE_JOBDIR" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc mkdir -p $DIR
> (In the working dir, "mkdir -p jobs/l/sleep-l5clzzvi ")
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK if
> "mkdir" is successful ----> NO, exit with 254
>     |
>     V
> YES, put "Created job directory : $DIR" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "CREATE_INPUTDIR" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc Create all
> subdirs in $DIR as in $DIRS ( create the same tree in
> jobs/l/sleep-l5clzzvi as in input file dir)
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "LINK_INPUTS" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK file
> system type ----> local disk, cp all files in $PWD/shared to $DIR
>     |
>     V
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc shared file
> system, create links in $DIR for all files in $PWD/shared
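
The copy-versus-link choice in this step might look something like the sketch here. link_inputs and its mode argument are hypothetical; the sketch follows the description in this analysis (all files in shared), which may be broader than what wrapper.sh actually does.

```shell
#!/bin/bash
# Hypothetical helper: make every regular file in the shared dir visible
# in the job dir. mode=local copies the files (job dir on local disk);
# any other mode creates symlinks (job dir on the shared file system).
# The shared path should be absolute so that the symlinks resolve.
link_inputs() {
    local shared="$1" jobdir="$2" mode="$3" f
    mkdir -p "$jobdir" || return 254
    for f in "$shared"/*; do
        [ -f "$f" ] || continue
        if [ "$mode" = "local" ]; then
            cp "$f" "$jobdir/" || return 254
        else
            ln -s "$f" "$jobdir/$(basename "$f")" || return 254
        fi
    done
}
```

Linking avoids a second copy of the data on the shared file system, while copying trades that space for local-disk speed during the run.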
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT "EXECUTE"
> into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK if
> "kickstart is enabled" ----> yes, use kickstart to run the job
>     |
>     V
> NO, run the job with wrapper.sh
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "EXECUTE_DONE" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT "Job ran
> successfully" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc CHECK if the
> output dir tree is the same as the one in $OUTF ----> NO, exit with 254
>     |
>     V
> YES, COPY all output files in $DIR back to $PWD/shared
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "RM_JOBDIR" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc rm -rf $DIR
> ("rm -rf jobs/l/sleep-l5clzzvi ")
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT
> "TOUCH_SUCCESS" into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc touch
> status/${JOBDIR}/${ID}-success ( touch status/l/sleep-l5clzzvi-success )
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc PUT "END"
> into log
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc closeinfo
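
The last few steps (check outputs, copy them back to shared, remove the job dir, touch the success marker) can be sketched in shell as follows. finish_job is a hypothetical name, and checking that each listed output file exists is an assumption about what the output check amounts to; this is not the code in wrapper.sh.

```shell
#!/bin/bash
# Hypothetical helper: finish a job whose dir is $wfdir/jobs/$jobdir/$id.
# Remaining arguments are the expected output files, relative to the job dir.
finish_job() {
    local wfdir="$1" jobdir="$2" id="$3"; shift 3
    local dir="$wfdir/jobs/$jobdir/$id" f
    for f in "$@"; do
        # a missing output is treated as a failure, as in the analysis
        [ -f "$dir/$f" ] || { echo "missing output: $f" >&2; return 254; }
        mkdir -p "$wfdir/shared/$(dirname "$f")" &&
            cp "$dir/$f" "$wfdir/shared/$f" || return 254
    done
    rm -rf "$dir"
    # the success marker is written only after outputs are safely in shared
    mkdir -p "$wfdir/status/$jobdir" && touch "$wfdir/status/$jobdir/${id}-success"
}
```

For the test case above this would be called roughly as: finish_job "$PWD" l sleep-l5clzzvi stdout.txt stderr.txt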