[mpich-discuss] Fix for Hydra LSF resources query
Pavan Balaji
balaji at mcs.anl.gov
Mon Aug 2 10:25:20 CDT 2010
The fix is committed in r6972
[http://trac.mcs.anl.gov/projects/mpich2/changeset/6972]. The latest
nightly snapshot
[http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/nightly/trunk]
should have it.
-- Pavan
On 08/01/2010 03:19 PM, Pavan Balaji wrote:
> Hello,
>
> Thanks for the patch. I don't think duplicating the string makes much
> difference from a correctness perspective, since the environment
> propagation mechanism is completely different in Hydra, and reads the
> environment before modifying it. However, I do agree that it's a better
> programming practice in general to not corrupt the environment. I've
> fixed it in my local git repository and will commit it to the svn as
> soon as it's up.
>
> -- Pavan
>
> On 07/30/2010 08:02 PM, Yauheni Zelenko wrote:
>> Hi!
>>
>> I want to propose a fix for HYDT_bscd_lsf_query_node_list(). Currently the tokenizer operates directly on the environment string and corrupts it (the application gets a truncated LSB_MCPU_HOSTS). So it's necessary to create a copy of the environment value and work on that copy.
>>
>> I suggest the following implementation (based on r6885):
>>
>> HYD_status HYDT_bscd_lsf_query_node_list(struct HYD_node **node_list)
>> {
>>     char *hosts;
>>     HYD_status status = HYD_SUCCESS;
>>
>>     HYDU_FUNC_ENTER();
>>
>>     if (MPL_env2str("LSB_MCPU_HOSTS", (const char **)&hosts) == 0)
>>         hosts = NULL;
>>
>>     if (hosts == NULL) {
>>         *node_list = NULL;
>>         HYDU_ERR_SETANDJUMP(status, HYD_INTERNAL_ERROR, "No LSF node list found\n");
>>     }
>>     else {
>>         char* hosts_copy = HYDU_strdup(hosts);
>>         char* hostname = strtok(hosts_copy, " ");
>>
>>         while (1) {
>>             char* num_procs_str;
>>             int num_procs;
>>
>>             if (hostname == NULL)
>>                 break;
>>
>>             /* the even fields in the list should be the number of
>>              * cores */
>>             num_procs_str = strtok(NULL, " ");
>>             HYDU_ASSERT(num_procs_str, status);
>>
>>             num_procs = atoi(num_procs_str);
>>
>>             status = HYDU_add_to_node_list(hostname, num_procs, node_list);
>>             HYDU_ERR_POP(status, "unable to add to node list\n");
>>
>>             hostname = strtok(NULL, " ");
>>         }
>>         HYDU_free(hosts_copy);
>>     }
>>
>> fn_exit:
>>     HYDU_FUNC_EXIT();
>>     return status;
>>
>> fn_fail:
>>     goto fn_exit;
>> }
>>
>> Please review it and include it in the Hydra code.
>>
>> Eugene.
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji
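
The environment corruption discussed in this thread can be illustrated outside of Hydra with plain libc calls. The sketch below is not Hydra code: the function names count_hosts, count_hosts_in_place and count_hosts_from_copy are hypothetical, and malloc/strcpy stand in for HYDU_strdup. It assumes the "host count host count ..." layout of LSB_MCPU_HOSTS described in the quoted patch. Because strtok() overwrites each delimiter with '\0', running it on the pointer returned by getenv() modifies the process environment itself, and on common libc implementations a later getenv() then typically sees only the first token.

```c
/* Exposes setenv() for callers that build in strict ISO C mode. */
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tokenize "host count host count ..." in place and return the number
 * of hosts, or -1 if a host has no count after it.  strtok() writes
 * '\0' over each delimiter, so the buffer passed in is modified. */
static int count_hosts(char *list)
{
    int n = 0;
    char *host = strtok(list, " ");

    while (host != NULL) {
        char *count = strtok(NULL, " ");
        if (count == NULL)
            return -1;          /* odd number of fields */
        n++;
        host = strtok(NULL, " ");
    }
    return n;
}

/* Problematic variant: tokenizes the string owned by the environment.
 * POSIX says callers must not modify the string getenv() returns. */
int count_hosts_in_place(const char *name)
{
    char *value = getenv(name);
    return value != NULL ? count_hosts(value) : -1;
}

/* Safer variant: tokenizes a private copy and leaves the environment
 * untouched.  The copy is freed on every return path. */
int count_hosts_from_copy(const char *name)
{
    const char *value = getenv(name);
    char *copy;
    int n;

    if (value == NULL)
        return -1;
    copy = malloc(strlen(value) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, value);
    n = count_hosts(copy);
    free(copy);
    return n;
}
```

To try it, set the variable with setenv("LSB_MCPU_HOSTS", "hostA 4 hostB 8", 1), call one of the two functions, and then print getenv("LSB_MCPU_HOSTS") to compare what the rest of the process would see afterwards.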