[Swift-user] cobalt can't find wrapperlogs

Allan Espinosa aespinosa at cs.uchicago.edu
Tue Nov 3 17:23:52 CST 2009


I did.

$ ifconfig eth0; echo $GLOBUS_HOSTNAME; ./demompi.sh
eth0      Link encap:Ethernet  HWaddr 00:14:5E:9C:0D:82
          inet addr:172.17.5.144  Bcast:172.31.255.255  Mask:255.240.0.0
          inet6 addr: fe80::214:5eff:fe9c:d82/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:88471902 errors:0 dropped:54 overruns:0 frame:222
          TX packets:84299690 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:67184589504 (64072.2 Mb)  TX bytes:69406208974 (66190.9 Mb)
          Interrupt:33

172.17.5.144
Swift svn swift-r3186 cog-r2577

RunID: run0
Progress:
Progress:  Stage in:1
Progress:  Submitted:1
Progress:  Active:1
Progress:  Active:1
Progress:  Active:1
Progress:  Active:1
Progress:  Active:1
Progress:  Active:1
Progress:  Active:1
Progress:  Checking status:1
Failed to transfer wrapper log from mpitest-run0/info/a on INTREPID
Progress:  Submitted:1
Progress:  Active:1
Progress:  Active:1
...
...
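
For reference, a minimal sketch of deriving the value from the interface instead of typing it by hand. It assumes the net-tools ifconfig format shown above ("inet addr:<ip>"); the helper name ipv4_from_ifconfig is hypothetical and not part of Swift or Cobalt:

```shell
# Hypothetical helper: read net-tools style ifconfig output on stdin and
# print the first IPv4 address found in an "inet addr:<ip>" field.
# The "inet6 addr:" line is skipped because it does not contain "inet addr:".
ipv4_from_ifconfig() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p' | head -n 1
}

# Usage on the login node (assumes eth0 exists and ifconfig uses this format):
#   export GLOBUS_HOSTNAME=$(ifconfig eth0 | ipv4_from_ifconfig)
#   echo "$GLOBUS_HOSTNAME"
```

Exporting the variable in the same shell that launches ./demompi.sh keeps the value visible to the swift process started from that script.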

I also set it using the "env" namespace:
<profile namespace="env" key="GLOBUS_HOSTNAME">172.17.5.144</profile>

Yet it doesn't seem to be reflected in the cobalt logs:

workdir$ grep GLOBUS *.cobaltlog
$
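
A small sketch for repeating that check per log file. It only reports whether a string appears in each *.cobaltlog and assumes nothing about what Cobalt normally writes there; the function name check_cobaltlogs is hypothetical:

```shell
# Hypothetical helper: for every *.cobaltlog in a directory, report whether
# the file contains the given string.
check_cobaltlogs() {
    dir=$1
    pattern=$2
    for f in "$dir"/*.cobaltlog; do
        # Skip the unexpanded glob when the directory has no cobalt logs.
        [ -e "$f" ] || continue
        if grep -q -- "$pattern" "$f"; then
            echo "$f: found"
        else
            echo "$f: missing"
        fi
    done
}

# Usage from inside the run's workdir:
#   check_cobaltlogs . GLOBUS_HOSTNAME
```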

thanks,
-Allan


2009/11/3 Mihael Hategan <hategan at mcs.anl.gov>:
> Make sure you set GLOBUS_HOSTNAME to the IP of eth0 before running.
>
> On Tue, 2009-11-03 at 16:56 -0600, Allan Espinosa wrote:
>> Hi,
>>
>> I'm using a cobalt-only sites.xml to launch MPI jobs from the
>> BlueGene.  But when I inspected the workdir, no job directories had
>> been created.
>>
>> swift session:
>> Swift svn swift-r3186 cog-r2577
>>
>> RunID: run0
>> Progress:
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/p on INTREPID
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/r on INTREPID
>> Progress:  Submitted:1
>> Progress:  Submitted:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Active:1
>> Progress:  Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/t on INTREPID
>> Execution failed:
>>         Exception in hello:
>> Arguments: []
>> Host: INTREPID
>> Directory: mpitest-run0/jobs/t/hello-tlf9cyij
>> stderr.txt:
>>
>> stdout.txt:
>>
>> ----
>>
>> Caused by:
>>         No status file was found. Check the shared filesystem on INTREPID
>>
>> listing of workdir:
>> intrepid-fs0/users/espinosa/scratch/mpi_runs/mpitest-run0> find .
>> .
>> ./shared
>> ./shared/_swiftwrap
>> ./shared/_swiftseq
>> ./kickstart
>> ./status
>> ./info
>> ./200173.cobaltlog
>> ./200174.cobaltlog
>> ./200175.cobaltlog
>> ./200176.cobaltlog
>> ./200177.cobaltlog
>>
>>
>> sites.xml:
>> <config>
>> <pool handle="INTREPID">
>>    <filesystem provider="local" />
>>    <execution provider="cobalt"/>
>>    <profile namespace="globus" key="hostCount">64</profile>
>>    <profile namespace="globus" key="project">HTCScienceApps</profile>
>>    <profile namespace="globus" key="maxtime">20</profile>
>>    <profile namespace="globus" key="mode">vn</profile>
>>    <profile namespace="globus" key="queue">prod-devel</profile>
>>    <workdirectory >/intrepid-fs0/users/espinosa/scratch/mpi_runs</workdirectory>
>> </pool>
>> </config>
>>
>>
>> Where do you think these jobdirs were created?  I have also attached
>> the swift log to this email.
>>
>> -Allan


