[Swift-devel] Re: Analysis of wrapper.sh

Michael Wilde wilde at mcs.anl.gov
Tue Jul 29 12:27:08 CDT 2008


Thanks, Zhao.

That's a good start. Where I want you to take this (with help from me and 
others on the team) is to create a detailed description of how data 
flows in Swift, for use by both end users and developers.

What you show here so far is mainly the wrapper code itself.

I'm looking for a diagram that shows the three main data locations and 
explains the important stages in data management during a workflow: 
what they mean and why they are done.

The three areas are: the data files' original location when the mapper 
sees them; the shared dir on each site; the work dir on each compute node.

Examples of questions I'd like this to cover are:

why do we have a shared dir? (Answer: to re-use transferred or generated 
files within a workflow without re-transferring).

what's the lifetime of this directory? what in it is persistent vs 
removed after jobs and/or scripts complete?

when does output come back? Where to?

how are relative vs absolute pathnames handled?

how are URL-prefixed pathnames handled? (gsiftp://, http:// etc?)

which Swift properties affect data management?
Same for options in profiles?

how should wrappers be written that reference files installed as part of 
the application?

what are the various ways in which wrappers and apps can utilize worker 
node disk, today?

which of the patches that Ben implemented for testing in March-April on 
the BGP and SiCortex are integrated, and which remain patches to be 
considered for testing and integration?

Some of these questions are more useful and make more sense than others, 
but this is the general thing I want to get documented.

- Mike



On 7/29/08 12:02 PM, Zhao Zhang wrote:
> Hi, All
> 
> I made this analysis of wrapper.sh. Correct me if anything is 
> wrong. Thanks.
> 
> zhao
> 
> 
> SWIFT phase: When Swift is started, it creates a directory named after 
> the workload, a timestamp and a random string, something like
> 
> /home/zzhang/swift/sleep-20080724-1527-qakbkkcc
>     |
>     |____info
>     |
>     |____kickstart
>     |
>     |____shared
>     |       |
>     |       |____wrapper.sh
>     |       |
>     |       |____seq.sh
>     |
>     |____status
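
As a rough illustration of the layout above, the same tree can be created 
by hand with a few lines of shell. This is only a sketch: Swift creates 
these directories itself, and the make_run_dir function and the scratch 
base directory are names made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swift): create the run-directory
# layout shown above.
#   $1 = base directory
#   $2 = run name, e.g. sleep-20080724-1527-qakbkkcc
make_run_dir() {
    base=$1
    run=$2
    for sub in info kickstart shared status; do
        mkdir -p "$base/$run/$sub" || return 1
    done
    # Print the path of the new run directory.
    echo "$base/$run"
}

# Example: build the layout under a scratch directory and list it.
scratch=$(mktemp -d)
rundir=$(make_run_dir "$scratch" sleep-20080724-1527-qakbkkcc)
ls "$rundir"
```

The wrapper.sh and seq.sh files under shared are not created here; in the 
trace they are already present when the task runs.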
> 
> 
> WRAPPER.SH phase
> 
> In my test case, BGexec received a task in this format:
> "shared/wrapper.sh sleep-l5clzzvi -jobdir l -e /bin/sleep -out 
> stdout.txt -err stderr.txt -i -d  -if  -of  -k  -a 600"
> with the working dir "/home/zzhang/swift/sleep-20080724-1527-qakbkkcc".
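
To make the task format concrete, here is a minimal sketch of how such a 
command line could be split into variables. It assumes each option takes 
whatever follows it up to the next option (possibly nothing) and that 
everything after -a belongs to the application. The parse_task function 
is hypothetical and the real wrapper.sh may parse its arguments 
differently; the variable names and the 254 exit code follow the analysis 
in this message.

```shell
#!/bin/sh
# Hypothetical parser for a wrapper.sh-style task line (illustration only).
parse_task() {
    ID=$1
    shift
    JOBDIR=""; EXEC=""; STDOUT=""; STDERR=""; STDIN=""
    DIRS=""; INF=""; OUTF=""; KICKSTART=""; ARGS=""
    cur=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -jobdir|-e|-out|-err|-i|-d|-if|-of|-k)
                cur=$1 ;;                      # remember which option is open
            -a)
                shift
                [ $# -gt 0 ] || return 254     # nothing after -a
                ARGS="$*"                      # the rest goes to the app
                return 0 ;;
            *)
                case "$cur" in
                    -jobdir) JOBDIR=$1 ;;
                    -e)      EXEC=$1 ;;
                    -out)    STDOUT=$1 ;;
                    -err)    STDERR=$1 ;;
                    -i)      STDIN=$1 ;;
                    -d)      DIRS="$DIRS${DIRS:+ }$1" ;;
                    -if)     INF="$INF${INF:+ }$1" ;;
                    -of)     OUTF="$OUTF${OUTF:+ }$1" ;;
                    -k)      KICKSTART=$1 ;;
                    *)       return 254 ;;     # value with no option before it
                esac ;;
        esac
        shift
    done
    return 254                                 # no -a option at all
}

# The task line from the test case, minus the leading shared/wrapper.sh.
parse_task sleep-l5clzzvi -jobdir l -e /bin/sleep -out stdout.txt \
    -err stderr.txt -i -d -if -of -k -a 600
echo "ID=$ID JOBDIR=$JOBDIR EXEC=$EXEC STDOUT=$STDOUT STDERR=$STDERR ARGS=$ARGS"
```

With the task line from the test case, -i, -d, -if, -of and -k are each 
followed directly by another option, so their variables stay empty.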
> 
> WORKING_DIRECTORY (the same for every operation below):
>     /home/zzhang/swift/sleep-20080724-1527-qakbkkcc
> 
> OPERATION
> 
> OPEN wrapper.log
> CHECK if jobid ($1, sleep-l5clzzvi) is empty
>     ----> empty: exit with 254
>     |
>     V
> Get -jobdir as $JOBDIR
> CHECK if -jobdir (l) is empty
>     ----> empty: exit with 254
>     |
>     V
> mkdir -p $WFDIR/info/$JOBDIR
>     (mkdir /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/info/l)
> rm -f "$WFDIR/info/$JOBDIR/${ID}-info"
>     (make a clean $ID-info file, ID=sleep-l5clzzvi)
> openinfo "$WFDIR/info/$JOBDIR/${ID}-info"
>     (openinfo /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/info/l/sleep-l5clzzvi-info)
>     creating log file
> PUT "LOG_START" into log
> PUT "Wrapper" into log
> mkdir -p $WFDIR/status/$JOBDIR
>     (create status parent dir for the job:
>      "mkdir -p /home/zzhang/swift/sleep-20080724-1527-qakbkkcc/status/l")
> PARSE the arguments
>     (EXEC=/bin/sleep, STDOUT=stdout.txt, STDERR=stderr.txt, STDIN=NULL,
>      DIRS=NULL, INF=NULL, OUTF=NULL, KICKSTART=NULL)
> CHECK if there are arguments after -a
>     ----> empty: exit with 254
>     |
>     V
> change $@ from "-a 600" to "600"
> CHECK if "$SWIFT_JOBDIR_PATH" is NULL
>     ----> NO: local copy
>     |
>     V
>     YES: shared file system. DIR=jobs/$JOBDIR/$ID
>     (DIR=jobs/l/sleep-l5clzzvi)
> set PATH
> PUT all arguments into log
> PUT "CREATE_JOBDIR" into log
> mkdir -p $DIR
>     (in the working dir, "mkdir -p jobs/l/sleep-l5clzzvi")
> CHECK if "mkdir" is successful
>     ----> NO: exit with 254
>     |
>     V
>     YES: put "Created job directory : $DIR" into log
> PUT "CREATE_INPUTDIR" into log
> Create all subdirs in $DIR as in $DIRS
>     (create the same tree in jobs/l/sleep-l5clzzvi as in the input
>      file dir)
> PUT "LINK_INPUTS" into log
> CHECK file system type
>     ----> local disk: cp all files in $PWD/shared to $DIR
>     |
>     V
>     shared file system: create links in $DIR for all files in
>     $PWD/shared
> PUT "EXECUTE" into log
> CHECK if kickstart is enabled
>     ----> YES: use kickstart to run the job
>     |
>     V
>     NO: run the job with wrapper.sh
> PUT "EXECUTE_DONE" into log
> PUT "Job ran successfully" into log
> CHECK if the output dir tree is the same as the one in $OUTF
>     ----> NO: exit with 254
>     |
>     V
>     YES: COPY all output files in $DIR back to $PWD/shared
> PUT "RM_JOBDIR" into log
> rm -rf $DIR
>     ("rm -rf jobs/l/sleep-l5clzzvi")
> PUT "TOUCH_SUCCESS" into log
> touch status/${JOBDIR}/${ID}-success
>     (touch status/l/sleep-l5clzzvi-success)
> PUT "END" into log
> closeinfo
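
The job-directory part of the trace above (CREATE_JOBDIR through 
TOUCH_SUCCESS) can be condensed into a short sketch for the shared file 
system, no-kickstart path. This is an assumption-laden illustration, not 
the real wrapper.sh: the run_job function, its argument order, the 
single output file and the scratch workflow directory are all invented 
for the example, and logging is left out.

```shell
#!/bin/sh
# Hypothetical condensed job lifecycle (shared file system, no kickstart).
#   $1 = workflow dir (absolute path)   $2 = jobdir   $3 = job id
#   $4 = name of the stdout file        rest = application and its args
run_job() {
    wfdir=$1; jobdir=$2; id=$3; outfile=$4
    shift 4
    dir="$wfdir/jobs/$jobdir/$id"

    # CREATE_JOBDIR (the analysis reports exit code 254 on failure)
    mkdir -p "$dir" "$wfdir/status/$jobdir" || return 254

    # LINK_INPUTS: link every file in shared/ into the job directory
    for f in "$wfdir"/shared/*; do
        if [ -e "$f" ]; then
            ln -s "$f" "$dir/" || return 254
        fi
    done

    # EXECUTE: run the application inside the job directory.
    # The failure code here is chosen for illustration only.
    ( cd "$dir" && "$@" > "$outfile" ) || return 1

    # Copy the output back to shared/, then RM_JOBDIR and TOUCH_SUCCESS
    cp "$dir/$outfile" "$wfdir/shared/" || return 254
    rm -rf "$dir"
    touch "$wfdir/status/$jobdir/$id-success"
}

# Example run in a scratch workflow directory with one input file.
wf=$(mktemp -d)
mkdir -p "$wf/shared"
echo "hello" > "$wf/shared/input.txt"
run_job "$wf" l sleep-l5clzzvi stdout.txt cat input.txt
cat "$wf/shared/stdout.txt"
ls "$wf/status/l"
```

After the run, the job directory is gone, the output sits in shared, and 
only the success marker under status records that the job completed.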


