[Swift-user] Passing hostType for MPI jobs

Andriy Fedorov fedorov at cs.wm.edu
Tue Jul 1 08:39:28 CDT 2008


Hi,

I am having problems passing the host type for MPI jobs. This happens
both when I use globusrun-ws (XML job description) and when I use
Swift (tc.data), although the errors are different.

I am trying to request nodes of type "compute" on the UC TeraGrid site.
This host type is recognized by PBS when I pass it to "qsub".

Basically, when I use an XML job description, I specify hostType
using the job description extension support
(http://www.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Job_Desc_Extensions.html#r-wsgram-extensions-constructs-nodes).
What happens is that I get the correct type of nodes, but the count is
not what I requested.

When I specify the hostType parameter in tc.data, I either get an error
(when I have hostCount="4:compute"):

 ===>

RunID: 20080701-0829-xstp5l98
Progress:
hello_mpi started
Progress:  Stage in:1
Failed to transfer wrapper log from
hello_mpi_swift-20080701-0829-xstp5l98/info/a/UC-GT4
Failed to transfer wrapper log from
hello_mpi_swift-20080701-0829-xstp5l98/info/b/UC-GT4
Failed to transfer wrapper log from
hello_mpi_swift-20080701-0829-xstp5l98/info/c/UC-GT4
hello_mpi failed
Execution failed:
        Exception in hello_mpi:
Arguments: []
Host: UC-GT4
Directory: hello_mpi_swift-20080701-0829-xstp5l98/jobs/c/hello_mpi-cltmnvui
stderr.txt:

stdout.txt:

----

Caused by:
        For input string: "4:compute"

<===

or I get the nodes of the wrong type (when I use hostType="compute" --
looks like it is just ignored).
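
The "Caused by" text in the log above has the same form as the message
of Java's NumberFormatException. That suggests, as an assumption and
not something confirmed from the Swift or CoG source, that the
hostCount value is read as a plain integer, so "4:compute" is rejected
before the job is submitted. A minimal sketch of that behaviour follows;
the class and method names are hypothetical and are not part of Swift:

```java
public class HostCountParseDemo {
    // Hypothetical helper (not Swift code): tries to read a hostCount
    // profile value as a plain base-10 integer.
    static String describeParse(String value) {
        try {
            return "parsed " + Integer.parseInt(value);
        } catch (NumberFormatException e) {
            // For base-10 input the JDK message has the form:
            //   For input string: "<value>"
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeParse("4"));         // plain count
        System.out.println(describeParse("4:compute")); // PBS-style count:type
    }
}
```

If this is what happens, the PBS-style "count:type" syntax cannot be
passed through hostCount, and the host type would have to be carried
by a separate attribute.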

Does anyone know how to specify the host type correctly? Is this a GT4
bug? I suspect there is a GT4 bug involved, because when I skip
<extensions>, I can correctly run an MPI job on 4 hosts. I don't know
what support Swift has for the host type functionality.

For reference, I attach my XML job description, tc.data,
sites.xml, Swift script, and the simple MPI "hello world" code.

hello_mpi.c (compile with `mpicc -o hello_mpi hello_mpi.c') ===>

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv){
        int myrank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        fprintf(stderr, "Hello, world from cpu %i (total %i)\n",
            myrank, size);
        MPI_Finalize();
        return 0;
}
<===

hello_mpi_xml.xml ===>
<job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
                <wsa:Address>https://tg-grid.uc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>

                <wsa:ReferenceProperties>
                <gram:ResourceID>PBS</gram:ResourceID>
                </wsa:ReferenceProperties>
        </factoryEndpoint>

        <executable>/home/fedorov/local/bin/hello_mpi</executable>

        <stdout>/home/fedorov/scratch/hello_mpi_xml.stdout</stdout>
        <stderr>/home/fedorov/scratch/hello_mpi_xml.stderr</stderr>

        <count>4</count>
        <hostCount>4</hostCount>
        <maxWallTime>10</maxWallTime>
        <jobType>mpi</jobType>

        <extensions>
        <resourceAllocationGroup>
        <hostType>compute</hostType>
        </resourceAllocationGroup>
        </extensions>
</job>
<===

hello_mpi_swift.swift ===>
type messagefile {}

(messagefile t) greeting() {
    app {
        hello_mpi stderr=@filename(t);
    }
}

messagefile outfile <"hello_mpi.txt">;

outfile = greeting();
<===

tc.data ===>
UC-GT4  hello_mpi       /home/fedorov/local/bin/hello_mpi_v INSTALLED INTEL32::LINUX GLOBUS::hostCount="4",jobType=mpi,maxWallTime="10",count="4",hostType="compute"
<===

sites.xml ===>
<pool handle="UC-GT4">
        <gridftp url="gsiftp://tg-gridftp.uc.teragrid.org" />
        <execution provider="gt4" jobmanager="PBS"
                url="https://tg-grid.uc.teragrid.org:8443/wsrf/services/ManagedJobFactoryService" />
        <workdirectory>/home/fedorov/scratch</workdirectory>
</pool>
<===


