[Swift-user] Pointer to Swift tutorials for computational science education and research
Ketan Maheshwari
ketan at mcs.anl.gov
Thu Sep 11 08:52:18 CDT 2014
Andrew,
Just to be clear, in the new script that Yadu posted, you will need to add
one entry in your tc.data as follows:
persistent-coasters bash /bin/bash
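For reference, here is a minimal shell sketch that adds the entry without duplicating it on repeated runs. The helper name add_tc_entry and the assumption that tc.data sits in the current directory are illustrative only; neither comes from Swift itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swift): append a tc.data entry
# only if the exact line is not already present.
# Usage: add_tc_entry FILE SITE APP PATH
add_tc_entry() {
    tc_file=$1
    entry="$2 $3 $4"
    # Create the file if it does not exist yet.
    [ -f "$tc_file" ] || : > "$tc_file"
    # -x matches whole lines only, -F treats the entry as a literal string.
    if ! grep -q -x -F -e "$entry" "$tc_file"; then
        printf '%s\n' "$entry" >> "$tc_file"
    fi
}

# Example (assumes tc.data lives in the current directory):
# add_tc_entry tc.data persistent-coasters bash /bin/bash
```

Running the helper twice with the same arguments leaves a single copy of the line in the file.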
Thanks,
Ketan
On Wed, Sep 10, 2014 at 11:03 PM, Yadu Nand <yadudoc1729 at gmail.com> wrote:
> Hi Andrew,
>
> If you would like to have swift move the executable for you, you could try
> the method used in the following example :
>
> type file;
>
> /* App definition calls bash, which in turn executes the bash script
>    passed as an argument here.
>    Every file in the input parameter list is staged by swift to the worker
>    nodes.
>    Every file in the return list is staged back to the client node by swift.
> */
> app (file out, file err) foo (file script, file input)
> {
> bash @script @input stdout=@out stderr=@err;
> }
>
> // Script to be executed
> file wrapper <"wrapper.sh">;
> file hello <"hello.txt">;
>
> file out <"foo.out">;
> file err <"foo.err">;
>
> (out, err) = foo (wrapper, hello);
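The example does not show wrapper.sh itself. A minimal sketch of what such a script might contain follows; the body is an assumption for illustration, and only the calling convention (the staged input file arrives as the first argument, stdout and stderr are captured into foo.out and foo.err) follows from the app definition above.

```shell
#!/bin/bash
# Hypothetical wrapper.sh. With the app definition above, Swift would invoke
# it roughly as: bash wrapper.sh hello.txt

process_input() {
    input=$1
    # Fail with a message on stderr if the staged input is missing.
    if [ ! -r "$input" ]; then
        echo "wrapper.sh: cannot read input file: $input" >&2
        return 1
    fi
    # Record which host ran the job, then emit the input contents.
    echo "host: $(uname -n)"
    cat "$input"
}

# Run only when an input argument was supplied.
if [ "$#" -gt 0 ]; then
    process_input "$1"
fi
```

Because the script travels with the job as an input file, it does not need to be installed on the worker nodes beforehand; only bash does.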
>
> Thanks,
> Yadu
>
>
>
> On Wed, Sep 10, 2014 at 8:45 PM, Andrew Stocker <amstocker at dons.usfca.edu>
> wrote:
>
>> Ketan,
>>
>> I copied the catnap executable to the same directory on each of the
>> computers and now the swift script is working perfectly without error.
>> Thanks for your help! What are the next steps we can take to set up our
>> cluster to not require the script to be on all the computers? Since we are
>> fairly new to parallel computing with a cluster, could you point us towards
>> any resources regarding the technical configuration for Swift? I've looked
>> at the documentation for tc.data but I am still a bit confused by it.
>>
>> Thanks,
>>
>> Andrew
>>
>> On Tue, Sep 9, 2014 at 7:49 AM, Ketan Maheshwari <ketan at mcs.anl.gov>
>> wrote:
>>
>>>
>>> On Mon, Sep 8, 2014 at 6:28 PM, Andrew Stocker <amstocker at dons.usfca.edu>
>>> wrote:
>>>
>>>> Thanks for your response!
>>>>
>>>> Since we're just in the stages of experimentation, our preliminary
>>>> cluster is just four iMacs connected to a switch. I set up password-less
>>>> ssh communication between the four and I'm able to start the coaster
>>>> service (in the folder with coaster-service.conf) without any errors. I am
>>>> running Swift from the computer that has catnap.sh installed at
>>>> the correct path, and I'm pretty sure it has the executable bit set (#!/bin/sh
>>>> is the first line of the program). None of the other three computers
>>>> have Swift installed, nor do they have catnap.sh at the location
>>>> specified in tc.data. Is this a problem?
>>>>
>>>
>>> Yes, that seems to be the issue. The executable (catnap.sh in this case)
>>> must be available on all compute nodes at the location specified in the tc.
>>>
>>> An alternative in this case is to treat catnap.sh as data and stage it
>>> along with the input data to the target compute nodes. However, we can do
>>> that later. For now, could you try putting catnap.sh in a common location
>>> on each of the compute nodes and try again?
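One way to place the script at the same path on every node is sketched below. The node names are hypothetical placeholders, and the sketch assumes password-less ssh is already set up, as described in the thread. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical host names; replace with the actual compute nodes.
NODES="imac2 imac3 imac4"
# Path taken from the tc.data line quoted in this thread.
SRC=/usr/local/swift-0.94.1/oakland-demo/catnap.sh

# Print (DRY_RUN=1, the default) or execute the copy commands for each node.
# Usage: distribute SOURCE_PATH NODE...
distribute() {
    src=$1
    shift
    dest_dir=$(dirname "$src")
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $node mkdir -p $dest_dir"
            echo "scp -p $src $node:$src"
        else
            # scp -p preserves the file modes, including the executable bit.
            ssh "$node" mkdir -p "$dest_dir" && scp -p "$src" "$node:$src"
        fi
    done
}

# NODES is left unquoted on purpose so it splits into one argument per host.
distribute "$SRC" $NODES
```

Setting DRY_RUN=0 runs the commands for real once the printed ones look right.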
>>>
>>> No, Swift does not need to be installed on the compute nodes. It only
>>> needs to be on the submit node.
>>>
>>>
>>>>
>>>> Attached is the log file from the run when I got the error I
>>>> copy+pasted above. Interestingly, when I run the catnap swift script with
>>>> only 3 concurrent instances, it seems to run fine, probably because we
>>>> allow 3 jobs per node and so it is only running locally.
>>>>
>>>> Regards,
>>>> Andrew
>>>>
>>>> On Fri, Sep 5, 2014 at 6:03 PM, Ketan Maheshwari <ketan at mcs.anl.gov>
>>>> wrote:
>>>>
>>>>> Hi Andrew,
>>>>>
>>>>> Yes, I remember: thanks for getting back on this.
>>>>>
>>>>> From the error message and tc.data, it does look like the executable
>>>>> is given as an absolute path, but somehow Swift is searching the system
>>>>> path and not finding it. One possibility is that the node on which
>>>>> catnap.sh is running does not have it installed at the path specified in
>>>>> tc.data. Can you also check whether catnap.sh has the executable bit set?
>>>>> It is less likely that this is causing the issue, though.
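Both conditions can be checked on a node with a few lines of shell, sketched below. The helper name check_executable is hypothetical. Note that a #!/bin/sh first line does not by itself make a file executable; the file mode bits do.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file exists and is executable
# by the current user.
# Usage: check_executable PATH
check_executable() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not executable: $path (try: chmod +x $path)"
        return 1
    fi
    echo "ok: $path"
}

# Example, using the path from the tc.data line in this thread:
# check_executable /usr/local/swift-0.94.1/oakland-demo/catnap.sh
```

Running the same check over ssh on each compute node would show whether the path in tc.data resolves there as well.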
>>>>>
>>>>> Also, from the tc.data line it looks like you are using persistent
>>>>> coasters. Have you started the coaster service beforehand and made sure
>>>>> it started correctly without any error messages? Could you tell us
>>>>> more about your cluster? Depending on the type of cluster, we may be
>>>>> able to run Swift in a non-persistent, implicit coasters mode.
>>>>>
>>>>> Can you also send the Swift-generated log for this run?
>>>>>
>>>>> Thanks,
>>>>> Ketan
>>>>>
>>>>>
>>>>> On Fri, Sep 5, 2014 at 7:16 PM, Andrew Stocker <
>>>>> amstocker at dons.usfca.edu> wrote:
>>>>>
>>>>>> Hi Ketan,
>>>>>>
>>>>>> I'm not sure if you remember, but my research advisor Xiaosheng and I
>>>>>> spoke to you at LBL in Oakland at the beginning of the summer
>>>>>> about starting to use Swift at our school. We have been working hard on
>>>>>> setting it up, and I am trying to get your demo to run but I'm having a
>>>>>> problem. For some reason I keep getting the following error when I try to
>>>>>> run your catsnsleep demo:
>>>>>>
>>>>>> Execution failed:
>>>>>> Exception in catnap:
>>>>>> Arguments: [5, data.txt]
>>>>>> Host: persistent-coasters
>>>>>> Directory:
>>>>>> catsnsleep-20140905-1702-mihcat06/jobs/u/catnap-utx080xl
>>>>>>
>>>>>> Caused by:
>>>>>> Cannot find executable catnap.sh on site system path
>>>>>> catnap, catsnsleep.swift, line 13
>>>>>>
>>>>>> However I'm not sure why. In our tc.data file we have the line:
>>>>>>
>>>>>> persistent-coasters catnap /usr/local/swift-0.94.1/oakland-demo/catnap.sh
>>>>>>
>>>>>> which I think should work but obviously something is going wrong.
>>>>>> I have been browsing the documentation articles but I can't find anything
>>>>>> about why this might be happening. We would greatly appreciate your advice!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Andrew Stocker
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 20, 2014 at 12:00 PM, Xiaosheng Huang <xhuang22 at usfca.edu>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Ketan Maheshwari <ketan at mcs.anl.gov>
>>>>>>> Date: Fri, Jun 20, 2014 at 11:45 AM
>>>>>>> Subject: Re: Pointer to Swift tutorials for computational science
>>>>>>> education and research
>>>>>>> To: Xiaosheng Huang <xhuang22 at usfca.edu>
>>>>>>> Cc: Wilde <wilde at mcs.anl.gov>
>>>>>>>
>>>>>>>
>>>>>>> Hi Xiaosheng,
>>>>>>>
>>>>>>> The tarball is: http://www.mcs.anl.gov/~ketan/oakland-demo.tgz
>>>>>>>
>>>>>>> There is a small README in there which outlines the steps.
>>>>>>>
>>>>>>> Best,
>>>>>>> Ketan
>>>>>>>
>>>>>>> ************************************************************
>>>>>>> Xiaosheng Huang, Assistant Professor
>>>>>>> Department of Physics and Astronomy
>>>>>>> University of San Francisco
>>>>>>> 2130 Fulton Street, San Francisco, CA 94117-1080
>>>>>>>
>>>>>>> Phone: (415) 422-6281
>>>>>>> E-mail: xhuang22 at usfca.edu
>>>>>>> ************************************************************
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>>
>
>
>
> --
> Yadu Nand B
>
>