[Swift-devel] Configuring Swift to access MosaStore
Emalayan Vairavanathan
svemalayan at yahoo.com
Mon Mar 5 17:25:05 CST 2012
Please find the attached setup.
Thank you
Emalayan
________________________________
From: Jonathan Monette <jonmon at mcs.anl.gov>
To: Emalayan Vairavanathan <svemalayan at yahoo.com>
Cc: Michael Wilde <wilde at mcs.anl.gov>; "emalayan at ece.ubc.ca" <emalayan at ece.ubc.ca>; "matei at ece.ubc.ca" <matei at ece.ubc.ca>; "swift-devel at ci.uchicago.edu" <swift-devel at ci.uchicago.edu>; "mosastore at googlegroups.com" <mosastore at googlegroups.com>; Jonathan Monette <jon.monette at gmail.com>
Sent: Monday, 5 March 2012 3:07 PM
Subject: Re: [Swift-devel] Configuring Swift to access MosaStore
If you could provide the setup you were using, that would be great. I can fill in anything missing and run my tests to verify.
On Mar 5, 2012, at 13:34, Emalayan Vairavanathan <svemalayan at yahoo.com> wrote:
>Thank you Jon.
>
>
>
>Yesterday I successfully ran Mosa (on our cluster) in CDM direct mode, with the help of the Swift user manual and the scripts available in /cog/modules/swift/tests/cdm/absolute.
>
>
>It would be useful if you could develop a simple test case. I can double-check it against my own test case.
>
>
>Thank you
>Emalayan
>
>
>
>
>________________________________
> From: Jonathan Monette <jonmon at mcs.anl.gov>
>To: Michael Wilde <wilde at mcs.anl.gov>
>Cc: "emalayan at ece.ubc.ca" <emalayan at ece.ubc.ca>; "matei at ece.ubc.ca" <matei at ece.ubc.ca>; "swift-devel at ci.uchicago.edu" <swift-devel at ci.uchicago.edu>; "mosastore at googlegroups.com" <mosastore at googlegroups.com>; Jonathan Monette <jon.monette at gmail.com>
>Sent: Monday, 5 March 2012 7:14 AM
>Subject: Re: [Swift-devel] Configuring Swift to access MosaStore
>
>Yea. I will get demo scripts together for the mosa tests.
>
>On Mar 5, 2012, at 8:17, Michael Wilde <wilde at mcs.anl.gov> wrote:
>
>> was: Re: [Swift-devel] coasters-hosts.pl script
>>
>> Jon, can you create a demo script that shows how to configure a Swift run to use MosaStore? The following approach may work:
>>
>> - Assume MosaStore will be mounted as /mosa to all workers
>>
>> - Simulate this with a localhost run, using /tmp/mosa, then do the same with *1* worker, N jobs per node (e.g. 4 on BG/P, 8 on PADS, 2 on Beagle).
>>
>> - Set CDM direct mode for all paths starting with [/tmp]/mosa. You might need to work through some of the issues with CDM direct where accesses need to match both /tmp/mosa and file:///tmp/mosa (I *think*)
>>
>> - Map some temporary output-to-input files to /tmp/mosa; create a multi-level "catsncats"-like workflow to exercise it; the recent ParameterSweep example, perhaps extended to do N levels of fan-in/fan-out and pass-N might be a good test.
>>
>> - see if you can get _concurrent to get placed on /tmp/mosa
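
The point about CDM direct accesses needing to match both /tmp/mosa and file:///tmp/mosa can be illustrated with a small standalone check. This is only a sketch in Python, not Swift's CDM rule syntax or its matching code; the pattern `MOSA_PATH` and the helper `is_mosa_path` are hypothetical names, and whether CDM really sees both forms is the open question raised in the message.

```python
import re

# Hypothetical pattern: accepts a bare path or a file:// URL, with or
# without the /tmp prefix, mirroring the "[/tmp]/mosa" shorthand above.
MOSA_PATH = re.compile(r"^(?:file://)?(?:/tmp)?/mosa(?:/.*)?$")


def is_mosa_path(path):
    """Return True if path names the MosaStore mount in either form."""
    return MOSA_PATH.match(path) is not None


# The file names below are made up for the demonstration.
for candidate in [
    "/tmp/mosa/out.dat",
    "file:///tmp/mosa/out.dat",
    "/mosa/out.dat",
    "/tmp/mosaic/out.dat",
    "/home/user/out.dat",
]:
    print(candidate, is_mosa_path(candidate))
```

The pattern is anchored at both ends so that a sibling directory such as /tmp/mosaic is not treated as part of the mount.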
>>
>> I think some of these tests would be a great test case for Swift/Turbine as well.
>>
>> You can do this in stages; the simple test of mapping CDM-direct files to /tmp/mosa should give Emalayan an initial test case to run once Mosa is ready on the BG/P.
>>
>> - Mike
>>
>>
>> ----- Original Message -----
>>> From: "Matei Ripeanu" <matei.ripeanu at gmail.com>
>>> To: mosastore at googlegroups.com, "Jonathan Monette" <jonmon at mcs.anl.gov>, "Justin M Wozniak" <wozniak at mcs.anl.gov>
>>> Cc: swift-devel at ci.uchicago.edu, emalayan at ece.ubc.ca
>>> Sent: Friday, March 2, 2012 6:29:17 PM
>>> Subject: Re: [Swift-devel] coasters-hosts.pl script
>>> Indeed this is good news! Thank you.
>>>
>>>
>>>
>>> Our next task, I think, will be to figure out how to configure Swift
>>> so that the headnode (where Swift runs) will not require any access to
>>> intermediate storage (MosaStore). Only the worker nodes will have
>>> access to intermediate storage. This is to work around the one-way
>>> headnode-to-worker-node connectivity issue.
>>>
>>>
>>>
>>> Any guidance on how to get this configuration would be much
>>> appreciated.
>>>
>>>
>>>
>>> Thank you again,
>>>
>>>
>>>
>>> -Matei
>>>
>>>
>>>
>>>
>>>
>>> From: mosastore at googlegroups.com [mailto:mosastore at googlegroups.com]
>>> On Behalf Of Emalayan Vairavanathan
>>> Sent: March-02-12 2:32 PM
>>> To: Jonathan Monette; Justin M Wozniak
>>> Cc: swift-devel at ci.uchicago.edu Devel; emalayan at ece.ubc.ca ;
>>> MosaStore
>>> Subject: Re: [Swift-devel] coasters-hosts.pl script
>>>
>>>
>>>
>>>
>>>
>>> Thank you Jon and Justin.
>>>
>>>
>>>
>>>
>>>
>>> This is great news. I will get back to you if I have questions.
>>>
>>>
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>> Emalayan
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> From: Jonathan Monette < jonmon at mcs.anl.gov >
>>> To: Justin M Wozniak < wozniak at mcs.anl.gov >
>>> Cc: " swift-devel at ci.uchicago.edu Devel " <
>>> swift-devel at ci.uchicago.edu >; emalayan at ece.ubc.ca
>>> Sent: Friday, 2 March 2012 2:21 PM
>>> Subject: Re: [Swift-devel] coasters-hosts.pl script
>>>
>>>
>>> Emalayan,
>>> We believe we have fixed the issue. You can copy the new
>>> coasters-hosts.pl script from
>>> ~jonmon/surveyor/worker-init-test/coasters-hosts.pl
>>>
>>> This script reads the worker logs located in the logs directory. The
>>> steps to run are as follows:
>>> start-coaster-service
>>> <wait for workers to start>
>>> ./coasters-hosts.pl logs/worker-*.log > worker-hosts.txt
>>>
>>> You MUST clean out the worker logs before you start a new coaster
>>> service, to make sure the script searches the right worker log
>>> files. This may not be ideal at the moment but it will help get you
>>> started. If you have any other questions, feel free to ask. We will
>>> need to update the mosaswift site with the new information; we will do
>>> this soon.
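
The log-cleaning step can be sketched as a small helper. It is a Python illustration only, assuming the worker logs sit in a single directory and follow the worker-*.log naming used in the command above; `clear_worker_logs` and the demo file names are hypothetical, and none of this is part of coasters-hosts.pl.

```python
import glob
import os
import tempfile


def clear_worker_logs(log_dir):
    """Delete worker-*.log files in log_dir and return the removed names."""
    removed = []
    for path in sorted(glob.glob(os.path.join(log_dir, "worker-*.log"))):
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


# Demonstration in a throwaway directory; the file names are hypothetical.
demo_dir = tempfile.mkdtemp()
for name in ["worker-0001.log", "worker-0002.log", "coasters.log"]:
    open(os.path.join(demo_dir, name), "w").close()

removed = clear_worker_logs(demo_dir)
remaining = sorted(os.listdir(demo_dir))
print("removed:", removed)
print("remaining:", remaining)
```

Only files matching worker-*.log are touched, so other logs in the same directory are left in place.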
>>>
>>> On Mar 2, 2012, at 11:26 AM, Jonathan Monette wrote:
>>>
>>>> Can we match this line: 2012/03/02 17:16:04.712 INFO - Running on
>>>> node 172.18.1.83 from the worker log,
>>>> instead of this line: 2012-03-02 17:21:25,214+0000 DEBUG Cpu worker
>>>> started: block=2012.0302.171344.704 host=172.18.1.83 id=0 from the
>>>> cps log?
>>>>
>>>> They both provide the same ip addresses. And the worker log always
>>>> has that ip address before the cps log does.
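
A minimal sketch of matching the worker-log line instead of the cps-log line. It is written in Python for illustration and is not the actual coasters-hosts.pl; the function name `extract_worker_hosts` and the second sample line are assumptions, while the "Running on node" line is copied from the message above.

```python
import re

# Pattern for the worker-log line quoted in the thread, e.g.
#   2012/03/02 17:16:04.712 INFO - Running on node 172.18.1.83
NODE_LINE = re.compile(r"Running on node (\d{1,3}(?:\.\d{1,3}){3})\b")


def extract_worker_hosts(lines):
    """Return the unique node addresses found in lines, in first-seen order."""
    seen = []
    for line in lines:
        match = NODE_LINE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample_log = [
    "2012/03/02 17:16:04.712 INFO - Running on node 172.18.1.83",
    "2012/03/02 17:16:05.001 INFO - some other message",  # hypothetical line
    "2012/03/02 17:16:04.712 INFO - Running on node 172.18.1.83",
]
hosts = extract_worker_hosts(sample_log)
print("\n".join(hosts))
```

Duplicates are dropped so that a worker whose log is read twice contributes a single entry to the host list.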
>>>>
>>>> On Mar 2, 2012, at 11:15 AM, Jonathan Monette wrote:
>>>>
>>>>> That fix still did not work. I had moved it to the same spot. It is
>>>>> still waiting for the worker-init.pl script to finish before the ip
>>>>> addresses are printed to the cps log. Those ip addresses are what
>>>>> is needed by the coasters-hosts.pl script to finish. If I create an
>>>>> empty file for the coasters-hosts.pl script to read, then the work
>>>>> continues and the ip addresses show up in the cps log.
>>>>>
>>>>> Why is log4j waiting to add those lines to the cps log after the
>>>>> worker-init.pl script is finished?
>>>>>
>>>>> On Mar 2, 2012, at 11:05 AM, Jonathan Monette wrote:
>>>>>
>>>>>> Thanks, in my copy I thought I had moved the reconnect to before
>>>>>> the init-cmd and it still wasn't working. I will test with your
>>>>>> change. I just verified that it was indeed waiting for the
>>>>>> worker-init.pl script to finish. I created an empty file for the
>>>>>> script to read and it finished connecting and the ip addresses I
>>>>>> needed were added to the cps log. I will also be testing your fix.
>>>>>>
>>>>>> On Mar 2, 2012, at 11:01 AM, Justin M Wozniak wrote:
>>>>>>
>>>>>>>
>>>>>>> Yes- I must have tested this with a different log file. I just
>>>>>>> checked in and installed in ~wozniak/Public a fix for this that
>>>>>>> launches WORKER_INIT_CMD after the reconnect(). I am a little
>>>>>>> worried about time outs but it works so far. I will continue
>>>>>>> testing...
>>>>>>>
>>>>>>> Justin
>>>>>>>
>>>>>>> On Thu, 1 Mar 2012, Jonathan Monette wrote:
>>>>>>>
>>>>>>>> Justin,
>>>>>>>> So I have been trying to help Emalayan get the host list file
>>>>>>>> for the worker-init.pl script. It seems the cps log file is not
>>>>>>>> providing the ip addresses for the coasters-hosts.pl script. I
>>>>>>>> thought this was maybe because we did not have the correct log4j
>>>>>>>> setting set but we have the Coaster service Cpu set to DEBUG. So
>>>>>>>> for some reason the workers are not connecting to the service.
>>>>>>>> When I comment out the export WORKER_ENVIRONMENT="…" line in the
>>>>>>>> coaster-service.conf file I see the workers connect and the cps
>>>>>>>> log file shows their ip addresses. However, when this line is
>>>>>>>> set, it seems they are not connecting.
>>>>>>>>
>>>>>>>> Emalayan thought there might be some sort of circular dependency
>>>>>>>> going with the host-list file and the worker. The worker
>>>>>>>> requires the host-list file so that it can run the
>>>>>>>> worker-init.pl script and then connect but the host-list file
>>>>>>>> cannot be generated because the workers cannot connect. I
>>>>>>>> noticed in your swift-test directory the cps files did have the
>>>>>>>> ip addresses set and coasters-hosts.pl found the ip addresses
>>>>>>>> and reported them. Did you try that test with setting the
>>>>>>>> WORKER_ENVIRONMENT variable in the coaster-service.conf file?
>>>>>>>> Any idea what may be happening? The job is running when looking
>>>>>>>> under cqstat.
>>>>>>>>
>>>>>>>> A side note: At the mosaswift site, your example talks about
>>>>>>>> running the coasters-hosts.pl on the cps log but the example you
>>>>>>>> provide runs it on logs/coasters.log. This may need to be
>>>>>>>> changed. Also, it should provide the log4j setting that is required
>>>>>>>> to generate the Cpu line with the worker ip address just to
>>>>>>>> clarify that this line should be set for this script to work.
>>>>>>>>
>>>>>>>> For reference, this line:
>>>>>>>>
>>>>>>>> log4j.logger.org.globus.cog.abstraction.coaster.service.job.manager.Cpu=DEBUG
>>>>>>>
>>>>>>> --
>>>>>>> Justin M Wozniak
>>>>>>
>>>>>> _______________________________________________
>>>>>> Swift-devel mailing list
>>>>>> Swift-devel at ci.uchicago.edu
>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "MosaStore" group.
>>> To post to this group, send email to mosastore at googlegroups.com .
>>> To unsubscribe from this group, send email to
>>> mosastore+unsubscribe at googlegroups.com .
>>> For more options, visit this group at
>>> http://groups.google.com/group/mosastore?hl=en .
>>
>> --
>> Michael Wilde
>> Computation Institute, University of Chicago
>> Mathematics and Computer Science Division
>> Argonne National Laboratory
>>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: catsn.tar
Type: application/x-tar
Size: 20480 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20120305/d9968d2d/attachment.tar>