[Swift-user] cobalt can't find wrapperlogs
Allan Espinosa
aespinosa at cs.uchicago.edu
Tue Nov 3 17:23:52 CST 2009
I did.
$ ifconfig eth0; echo $GLOBUS_HOSTNAME; ./demompi.sh
eth0 Link encap:Ethernet HWaddr 00:14:5E:9C:0D:82
inet addr:172.17.5.144 Bcast:172.31.255.255 Mask:255.240.0.0
inet6 addr: fe80::214:5eff:fe9c:d82/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:88471902 errors:0 dropped:54 overruns:0 frame:222
TX packets:84299690 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:67184589504 (64072.2 Mb) TX bytes:69406208974 (66190.9 Mb)
Interrupt:33
172.17.5.144
Swift svn swift-r3186 cog-r2577
RunID: run0
Progress:
Progress: Stage in:1
Progress: Submitted:1
Progress: Active:1
Progress: Active:1
Progress: Active:1
Progress: Active:1
Progress: Active:1
Progress: Active:1
Progress: Active:1
Progress: Checking status:1
Failed to transfer wrapper log from mpitest-run0/info/a on INTREPID
Progress: Submitted:1
Progress: Active:1
Progress: Active:1
...
...
I also set it using the "env" namespace:
<profile namespace="env" key="GLOBUS_HOSTNAME">172.17.5.144</profile>
Yet it doesn't seem to be reflected in the cobalt logs:
workdir$ grep GLOBUS *.cobaltlog
$
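For anyone reproducing this, here is a minimal sketch of deriving the value from the interface instead of typing it by hand. The helper name eth0_ipv4 is hypothetical, and it assumes the older ifconfig output format shown above ("inet addr:A.B.C.D"); newer ifconfig versions print the address differently and would need another pattern.

```shell
# Hypothetical helper: read ifconfig output on stdin and print the first
# IPv4 address found in the old-style "inet addr:A.B.C.D" field.
# The "inet6 addr:" line does not match because of the "6".
eth0_ipv4() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p' | head -n 1
}

# Intended use before starting the run (left commented out so the
# snippet has no side effects when sourced):
#   GLOBUS_HOSTNAME=$(ifconfig eth0 | eth0_ipv4)
#   export GLOBUS_HOSTNAME
```

With the ifconfig output above, this should yield the same 172.17.5.144 that echo $GLOBUS_HOSTNAME printed.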
thanks,
-Allan
2009/11/3 Mihael Hategan <hategan at mcs.anl.gov>:
> Make sure you set GLOBUS_HOSTNAME to the IP of eth0 before running.
>
> On Tue, 2009-11-03 at 16:56 -0600, Allan Espinosa wrote:
>> Hi,
>>
>> I'm using a cobalt-only sites.xml to launch MPI jobs from the
>> BlueGene. But when I inspected the workdir, no job directories had
>> been created.
>>
>> swift session:
>> Swift svn swift-r3186 cog-r2577
>>
>> RunID: run0
>> Progress:
>> Progress: Submitted:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/p on INTREPID
>> Progress: Submitted:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/r on INTREPID
>> Progress: Submitted:1
>> Progress: Submitted:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Active:1
>> Progress: Checking status:1
>> Failed to transfer wrapper log from mpitest-run0/info/t on INTREPID
>> Execution failed:
>> Exception in hello:
>> Arguments: []
>> Host: INTREPID
>> Directory: mpitest-run0/jobs/t/hello-tlf9cyij
>> stderr.txt:
>>
>> stdout.txt:
>>
>> ----
>>
>> Caused by:
>> No status file was found. Check the shared filesystem on INTREPID
>>
>> listing of workdir:
>> intrepid-fs0/users/espinosa/scratch/mpi_runs/mpitest-run0> find .
>> .
>> ./shared
>> ./shared/_swiftwrap
>> ./shared/_swiftseq
>> ./kickstart
>> ./status
>> ./info
>> ./200173.cobaltlog
>> ./200174.cobaltlog
>> ./200175.cobaltlog
>> ./200176.cobaltlog
>> ./200177.cobaltlog
>>
>>
>> sites.xml:
>> <config>
>> <pool handle="INTREPID">
>> <filesystem provider="local" />
>> <execution provider="cobalt"/>
>> <profile namespace="globus" key="hostCount">64</profile>
>> <profile namespace="globus" key="project">HTCScienceApps</profile>
>> <profile namespace="globus" key="maxtime">20</profile>
>> <profile namespace="globus" key="mode">vn</profile>
>> <profile namespace="globus" key="queue">prod-devel</profile>
>> <workdirectory >/intrepid-fs0/users/espinosa/scratch/mpi_runs</workdirectory>
>> </pool>
>> </config>
>>
>>
>> Where do you think these jobdirs were created? I have also attached
>> the Swift log to this email.
>>
>> -Allan
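A quick way to confirm which run subdirectories are absent is sketched below. The function name missing_run_dirs is hypothetical, and the list of expected subdirectories is an assumption taken only from the find listing and the "Directory: mpitest-run0/jobs/..." line in the error above, not from Swift documentation.

```shell
# Hypothetical helper: print each expected subdirectory that does not
# exist under the given run directory, one name per line.
# The names (shared, kickstart, status, info, jobs) are assumed from the
# listing and error message in this thread.
missing_run_dirs() {
    run_dir=$1
    for d in shared kickstart status info jobs; do
        [ -d "$run_dir/$d" ] || echo "$d"
    done
}

# Example call (path as in the sites.xml above; commented out here):
#   missing_run_dirs /intrepid-fs0/users/espinosa/scratch/mpi_runs/mpitest-run0
```

For the listing quoted above, this would be expected to report only jobs, which matches the symptom that no job directories were created.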