[mpich-discuss] Fix for Hydra/blaunch in MPICH2-1.3

Yauheni Zelenko zelenko at cadence.com
Thu Jul 22 19:03:55 CDT 2010


Hi!

I'd like to suggest a fix for Hydra when it is used on a Platform LSF cluster. rsh and ssh are disabled by default on such clusters, so Platform LSF's blaunch should be used instead.

However, the default blaunch behavior is not compatible with Hydra's handling of standard streams: the application just hangs and its output is lost. To work around this, blaunch should be run with the "-n" command-line option. Platform is aware of the issue and will probably fix blaunch in a future version.

I introduced a blaunch bootstrap option, implemented as a variant of the ssh bootstrap.

The fix also uses LSF_BINDIR (set by Platform LSF) to find the path to blaunch. My code is based on MPICH2-1.3a2.
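
As a rough, standalone illustration of the LSF_BINDIR lookup described above (not the Hydra code itself), the path could be assembled as follows. The helper names join_path and blaunch_from_lsf_bindir are hypothetical, and plain getenv/malloc stand in for Hydra's own wrappers:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Hydra): join a directory and a file
 * name, adding a '/' only when the directory does not already end in one.
 * Returns a malloc'd string the caller must free, or NULL if an argument
 * is missing, the directory is empty, or allocation fails. */
char *join_path(const char *dir, const char *name)
{
    size_t dir_len, total;
    int need_sep;
    char *out;

    if (dir == NULL || name == NULL || dir[0] == '\0')
        return NULL;

    dir_len = strlen(dir);
    need_sep = (dir[dir_len - 1] != '/');
    /* directory + optional separator + name + terminating NUL */
    total = dir_len + (need_sep ? 1 : 0) + strlen(name) + 1;

    out = malloc(total);
    if (out == NULL)
        return NULL;

    snprintf(out, total, "%s%s%s", dir, need_sep ? "/" : "", name);
    return out;
}

/* Hypothetical helper: build the blaunch path from LSF_BINDIR. Returns
 * NULL when the variable is unset or empty, so the caller can fall back
 * to another lookup. */
char *blaunch_from_lsf_bindir(void)
{
    return join_path(getenv("LSF_BINDIR"), "blaunch");
}
```

Computing the separator once and letting snprintf do the copy avoids the two separate strcpy branches, at the cost of not using Hydra's allocation macros.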

Please review my changes and include them in the main code base. I may not be aware of existing utility functions for working with paths.

In bsci_init.c:

from:

    if (!strcmp(bootstrap, "rsh") || !strcmp(bootstrap, "fork"))
        bootstrap = "ssh";

to:

    if (!strcmp(bootstrap, "rsh") || !strcmp(bootstrap, "fork") || !strcmp(bootstrap, "blaunch"))
        bootstrap = "ssh";
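
The condition above can be read as a small alias table: several launcher names share the ssh bootstrap implementation. A minimal sketch of that idea, where the function name bootstrap_module_for is hypothetical and not part of Hydra:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the name of the bootstrap implementation
 * that handles a given launcher. The names listed in ssh_family are the
 * ones the patched condition maps to "ssh"; any other name is returned
 * unchanged. */
const char *bootstrap_module_for(const char *launcher)
{
    static const char *const ssh_family[] = { "rsh", "fork", "blaunch" };
    size_t i;

    for (i = 0; i < sizeof(ssh_family) / sizeof(ssh_family[0]); i++) {
        if (strcmp(launcher, ssh_family[i]) == 0)
            return "ssh";
    }
    return launcher;
}
```

With a table like this, adding another ssh-style launcher would mean adding one string rather than extending the if condition.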

In ssh_launch.c:

from:

    if (!strcmp(HYDT_bsci_info.bootstrap, "ssh")) {
        if (!path)
            path = HYDU_find_full_path("ssh");
        if (!path)
            path = HYDU_strdup("/usr/bin/ssh");
    }
    else {
        if (!path)
            path = HYDU_find_full_path("rsh");
        if (!path)
            path = HYDU_strdup("/usr/bin/rsh");
    }

    idx = 0;
    targs[idx++] = HYDU_strdup(path);

    /* Allow X forwarding only if explicitly requested */
    if (!strcmp(HYDT_bsci_info.bootstrap, "ssh")) {
        if (HYDT_bsci_info.enablex == 1)
            targs[idx++] = HYDU_strdup("-X");
        else if (HYDT_bsci_info.enablex == 0)
            targs[idx++] = HYDU_strdup("-x");
        else    /* default mode is disable X */
            targs[idx++] = HYDU_strdup("-x");
    }

to:

    if (!strcmp(HYDT_bsci_info.bootstrap, "ssh")) {
        if (!path)
            path = HYDU_find_full_path("ssh");
        if (!path)
            path = HYDU_strdup("/usr/bin/ssh");
    }
    else if (!strcmp(HYDT_bsci_info.bootstrap, "blaunch")) {
        char *BinDirPath = NULL;

        /* Prefer $LSF_BINDIR/blaunch; LSF_BINDIR is set by Platform LSF */
        MPL_env2str("LSF_BINDIR", (const char **) &BinDirPath);
        if (!path && BinDirPath) {
            int BinDirLength = strlen(BinDirPath);

            if (BinDirLength > 0) {
                /* directory + "blaunch" + terminating NUL */
                int PathLength = BinDirLength + 1 + strlen("blaunch");

                /* one more byte for the '/' separator if it is missing */
                if (BinDirPath[BinDirLength - 1] != '/')
                    ++PathLength;
                HYDU_MALLOC(path, char *, PathLength * sizeof(char), status);
                strcpy(path, BinDirPath);
                if (BinDirPath[BinDirLength - 1] != '/') {
                    path[BinDirLength] = '/';
                    strcpy(path + BinDirLength + 1, "blaunch");
                }
                else
                    strcpy(path + BinDirLength, "blaunch");
            }
        }
        /* Otherwise fall back to HYDU_find_full_path() */
        if (!path)
            path = HYDU_find_full_path("blaunch");
    }
    else {
        if (!path)
            path = HYDU_find_full_path("rsh");
        if (!path)
            path = HYDU_strdup("/usr/bin/rsh");
    }

    idx = 0;
    targs[idx++] = HYDU_strdup(path);

    /* Allow X forwarding only if explicitly requested */
    if (!strcmp(HYDT_bsci_info.bootstrap, "ssh")) {
        if (HYDT_bsci_info.enablex == 1)
            targs[idx++] = HYDU_strdup("-X");
        else if (HYDT_bsci_info.enablex == 0)
            targs[idx++] = HYDU_strdup("-x");
        else    /* default mode is disable X */
            targs[idx++] = HYDU_strdup("-x");
    }
    else if (!strcmp(HYDT_bsci_info.bootstrap, "blaunch")) {
        targs[idx++] = HYDU_strdup("-n");
    }

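To summarize which option the patched code places right after the launcher path, here is a small standalone sketch. The function launcher_flag is hypothetical and only mirrors the branches shown above; it is not part of Hydra:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the option that follows the launcher path
 * in the argument list, or NULL if none is added.
 *   ssh     -> "-X" only when X forwarding is explicitly enabled
 *              (enable_x == 1), otherwise "-x"
 *   blaunch -> "-n", the workaround for the standard-stream hang
 *   other   -> no extra option (e.g. rsh) */
const char *launcher_flag(const char *launcher, int enable_x)
{
    if (strcmp(launcher, "ssh") == 0)
        return (enable_x == 1) ? "-X" : "-x";
    if (strcmp(launcher, "blaunch") == 0)
        return "-n";
    return NULL;
}
```

Note that the two "-x" branches in the ssh case produce the same result, so the sketch folds them into one.
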
Eugene.

