[Swift-user] XDTM

Michael Wilde wilde at mcs.anl.gov
Fri Aug 14 10:38:07 CDT 2009


On 8/7/09 3:40 PM, J A wrote:
> Hi Michael:
> 
>    1. After running a .swift or .dtm code, two files gets created:  .xml
>       and .klm.  What do they represent?

.xml is an XML version of the parsed .swift file.
.kml (not klm) is the XML representation of the Karajan script that the
Swift script is translated into for execution. It's actually the .kml
file that is executed by Karajan, which drives the execution of a Swift
script.

>    2. Correct me if I am wrong:
>           * Datasets are mapped to physical presentation using mapping
>             algorithms.  Some mapping algorithms already created part of
>             swift and the user can add/create others and use the
>             existing once as the base.

Yes, that's right.

But, to clarify this part:
  "user can add/create others and use the existing ones as the base."

The user can use existing mappers, and add new mappers, either in Java 
or as external executables or scripts. But each mapper is independent. 
When you say "can use existing ones as a base" I would say that's
correct, in that a user could *copy* and modify the code of one mapper 
to create another mapper, or, in the case of an "ext" mapper, one ext 
mapper could conceivably execute another and modify/filter its output to 
create a new mapping.
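
To make the "ext" mapper case a bit more concrete, here is a minimal
sketch of what such an external mapper script might do, written in Python
purely for illustration. It assumes the mapper reports one line per
dataset member with two columns: the Swift variable reference, then the
file name. The function name, the "[i]" form of the array reference, and
the directory/suffix arguments are assumptions made for this example, not
something taken from the Swift sources, so check the user guide's ext
mapper section for the exact format.

```python
import os


def mapping_lines(directory, suffix):
    """Return one 'reference path' line per matching file, sorted by name.

    Hypothetical helper: the '[i]' array-index reference form is assumed.
    """
    names = sorted(n for n in os.listdir(directory) if n.endswith(suffix))
    lines = []
    for index, name in enumerate(names):
        # Column 1: the Swift variable reference (here, an array member).
        # Column 2: the external file that this member is mapped to.
        lines.append("[%d] %s" % (index, os.path.join(directory, name)))
    return lines


def main(argv):
    """Print the mapping for argv = [directory, suffix] to standard output."""
    for line in mapping_lines(argv[0], argv[1]):
        print(line)
```

Calling main(sys.argv[1:]) from a small wrapper turns this into a
standalone script. A second mapper could run the first one and filter or
rewrite its lines, which is the "use one as a base" case described above.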

>           * Currently, the physical representation are files.

Yes, if you mean to say that mappers map files to Swift variables.

>    3. In the fMRI example, I see volume, Image, etc declared as
>       a type?   who defines them as a type?


>    4. In one of your emails, you stated that Swift functions can
>       accept files, int, string, float and boolean values as arguments.
>       They return files, or scalar values inside files. My question is: 
>       if the output is a string that is inside a file, how can I use
>       this output in another program that takes it as an input?  doesn't
>       call the file name and should have a code to read from the file?

Yes, you can use readData() or readData2() to read the contents of a 
file back into Swift variables (including into arrays and structures, if 
the output has some structure).

>    5. I am still confused when talk about XML Data Type and Mapping. 
>       Where is the XML representation?  Is it the .xml that gets
>       generated when run the swift code?

No, the XML - if indeed it still exists - is only internal. I described 
it this way in an earlier post:

--

"As Swift evolved from its early prototypes to a more mature system, the 
notion of XDTM evolved to one of mapping between filesystem-based 
structures and Swift in-memory data structures (ie, scalars, arrays, and 
structures, which can be nested and typed).

This is best seen by looking at the "external" mapper, ...

In other words, it still has the flavor of XDTM, but without any XML 
being visible to the user. It meets the same need but is easier to use 
and explain."

--

When XDTM was first implemented, by Yong Zhao, he used XML within Swift 
to represent the mapping. I am not even sure if this XML representation 
is still used in the current implementation, or not. I suspect *not*.

But the important concept here should really be called "DTM" - dataset
typing and mapping - and it's embodied in the type model and mapping
model of the language.

So you should stop thinking about data typing and mapping as being 
connected in any way to XML.

What we described in earlier papers as XDTM is not something that you
can experiment with in terms of XML: ie, you cannot see the XML for a
mapping because it's either deep inside the Swift implementation, or it
no longer exists in the current Swift code.

>    6. Let's look at this example:
> 
>     type messagefile {}
>      
>     (messagefile t) greeting (string s[]) {  
>         app {
>             echo s[0] s[1] s[2] stdout=@filename(t);
>         }
>     }
>      
>     messagefile outfile <"q5out.txt">;
>      
>     string words[] = ["how","are","you"];
>      
>     outfile = greeting(words);
>     ===
>      
>     So we have messagefile as a data type.  outfile and words are
>     datasets.  what will be the physical representation for these 2
>     datasets?

An object of type messagefile will be represented as a single physical 
file externally, and internally as a scalar variable.

Words is an array of strings.

Each atomic Swift variable (ie, scalars, array members, and structure 
members) can be thought of as a triple:

   (set-state, mapping, value)

All variables have a set-state; initially unset, then set when the 
variable is assigned a value.

File-valued variables have only a mapping, but no value.
Scalar-valued variables (ie, non-mapped variables like strings, as in your
example) have a value (eg the string, integer, boolean or float value)
but no mapping.
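
One way to picture that triple is the small Python model below. It is
purely conceptual and is not how Swift is implemented; the class name,
field names, and methods are made up for this illustration, and it uses
the outfile and words variables from your example.

```python
from dataclasses import dataclass
from typing import Optional, Union

Scalar = Union[str, int, float, bool]


@dataclass
class SwiftVar:
    """Conceptual (set-state, mapping, value) triple for one atomic variable."""
    is_set: bool = False             # set-state: every variable starts unset
    mapping: Optional[str] = None    # file name; only file-valued variables have one
    value: Optional[Scalar] = None   # in-memory value; only non-mapped scalars have one

    def assign(self, value: Scalar) -> None:
        """Give a non-mapped scalar its value and mark it as set."""
        self.value = value
        self.is_set = True

    def mark_set(self) -> None:
        """Mark a file-valued variable as set, e.g. once its file exists."""
        self.is_set = True


# messagefile outfile <"q5out.txt">;  -> a mapping, no value, initially unset
outfile = SwiftVar(mapping="q5out.txt")

# string words[] = ["how","are","you"];  -> three scalars with values, no mapping
words = [SwiftVar() for _ in range(3)]
for var, text in zip(words, ["how", "are", "you"]):
    var.assign(text)

# outfile = greeting(words);  -> outfile becomes set when the app has run
outfile.mark_set()
```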

We're still looking for better terminology to describe this; the current 
user guide uses both the terms "mapped type" and "marker type" to denote 
a file-valued variable. Both terms refer to the same concept; I'm leaning
toward the term "mapped type".

>     Is the system parsing the Swift code, identifying the
>     data types and datasets and based on that choosing the proper mapping
>     algorithm needed?

After the Swift command parses the Swift code, execution begins - i.e.
the .kml file is executed by Karajan. Mappers are called, as can be seen
in the .kml file. (And you can see their actions in the Swift .log file.)

The mapping for all mapped variables is either specified by the user 
(the most common case) or defaults to concurrent_mapper.

The user guide describes this in pretty good detail.

I hope that gets you a bit further. I hope that looking at XML mappings 
is not critical to your research, as I don't think you'll be able to 
readily get an XML intermediate form out of Swift.

An interesting topic would be to implement mechanisms to handle data in 
XML representations, in particular to enable Swift to invoke SOAP 
services as well as file-based applications and to compose scripts that 
call both forms of application.

- Mike

>      
> 
> Thanks,
> Jamal
>  
> 
>  
> 
> 
> 
> On Sun, Jul 26, 2009 at 9:53 PM, Michael Wilde <wilde at mcs.anl.gov> wrote:
> 
>     Hi Jamal,
> 
>     A lot of this is covered in the Swift user guide and tutorial. Have
>     you read through those yet?
> 
>     All the docs are at: http://www.ci.uchicago.edu/swift/docs/index.php
> 
>     If so, and the clarifications below don't help, please ask again on
>     the list, OK?
> 
>     - Mike
> 
> 
> 
>     On 7/26/09 7:27 PM, J A wrote:
> 
>         Hi Michael:
>          First, thank you for your reply and information provided.
>          I am trying to understand more how it handles the input/output
>         parameters and make them available for other functions.
> 
> 
>     All functions in Swift are either atomic interfaces to application
>     programs (ie, how to exec the program) or composite higher level
>     functions.
> 
>          To illustrate, I will give this example for the sake of discussion:
>          I have a C program called test.c that contains 4 functions (
>         main(), F1, F2, and F3).  each function takes some parameters
>         such as int, string, name of a file that is in the same
>         directory, and each one produced some output (string, int, and a
>         file).  Of course i am using global variables.  Now, main calls
>         F1, F1 passes its output to F2, and F2 passes its output to F3.
> 
> 
>     Swift doesn't look at the functions inside an application. It invokes
>     the application as a program (think fork/exec) just like a shell
>     would, but distributed and in parallel if so specified.
> 
>          Overall, the test.c takes an int, string, and file, and output
>         several files.  the output files contains output produced by the
>         internal functions (tasks).
> 
> 
>     Swift functions can accept files, int, string, float and
>     boolean values as arguments. They return files, or scalar values
>     inside files. (Again, think shell scripts).  Composite structures -
>     structs and arrays - of the above can be passed.
> 
>          I would like to understand more when i transfer my code to
>         Swift how it handles the input/output data, where it stores
>         them, etc.  I read couple of papers about XDTM and still have
>         some confusion about the terms:  dataset, typed, how/where its
>         physical representation is located at, and how the input/output
>         is used within the internal functions.
> 
> 
>     Files are by default named ("mapped") relative to the directory in
>     which you run the Swift command. Many flexible extensions to that
>     model are provided for (eg, URIs).  Swift sends the data to the site
>     chosen for execution (that's yet another topic) and returns results
>     back to the same submission host.
> 
>     Mapping declarations in the Swift script specify how files and
>     directory structures are mapped to Swift variables (scalars, arrays,
>     structures). These are used in the specification of the Swift code.
>     When Swift runs programs, it takes files that were mapped and knows
>     how to send them to grid sites or clusters and get data back.
> 
>           I am new to this area and trying to understand how the DTM works.
>          Any help from your side on this area is really appreciated.
>          Thanks,
>         Jamal
>          
>          On Sun, Jul 26, 2009 at 7:09 PM, Michael Wilde
>         <wilde at mcs.anl.gov> wrote:
> 
>            Jamal,
> 
>            As Swift evolved from its early prototypes to a more mature
>            system, the notion of XDTM evolved to one of mapping between
>            filesystem-based structures and Swift in-memory data
>            structures (ie, scalars, arrays, and structures, which can be
>            nested and typed).
> 
>            This is best seen by looking at the "external" mapper, which
>            allows a user to map a dataset using any external program
>            (typically a script) that returns the members of the dataset
>            as a two-column list: the Swift variable reference, and the
>            external file or URI.
> 
>            See the user guide section on the external mapper:
> 
>            http://www.ci.uchicago.edu/swift/guides/userguide.php#mapper.ext_mapper
>            (but the example in the user guide doesn't show the power of
>            mapping to nested structures).
> 
>            In other words, it still has the flavor of XDTM, but without
>            any XML being visible to the user. It meets the same need but
>            is easier to use and explain.
> 
>            - Mike
> 
> 
>            On 7/26/09 2:50 PM, J A wrote:
> 
>                Hi All:
>                 Can any one direct me to a source with more
>                examples/explanation on how XDTM is working/implemented?
>                 Thanks,
>                Jamal
>                
>              
>          ------------------------------------------------------------------------
> 
>                _______________________________________________
>                Swift-user mailing list
>                Swift-user at ci.uchicago.edu
> 
>                http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 
> 
> 


