[Mochi-devel] running mochi examples

Dorier, Matthieu mdorier at anl.gov
Wed Nov 13 08:30:26 CST 2019


Hi,

I’m not sure we are referring to the same problem. Ten days ago, the problem was that Mercury wasn’t finding the proper underlying transport and was displaying NA errors. Later, the problem seemed to be a linking one, during compilation.

I may have missed a few steps. What is the current problem, right now?

Thanks,

Matthieu

From: mochi-devel <mochi-devel-bounces at lists.mcs.anl.gov> on behalf of Abid Malik via mochi-devel <mochi-devel at lists.mcs.anl.gov>
Reply to: Abid Malik <abidmuslim at gmail.com>
Date: Wednesday, 13 November 2019 at 13:58
To: "Latham, Robert J." <robl at mcs.anl.gov>
Cc: "mochi-devel at lists.mcs.anl.gov" <mochi-devel at lists.mcs.anl.gov>
Subject: Re: [Mochi-devel] running mochi examples

Thanks, Rob.
I tried both login and compute nodes (through interactive mode). I feel that there is something wrong with my environment. I am talking to the IT help people and will get back to you.

Abid

On Tue, Nov 12, 2019 at 3:33 PM Latham, Robert J. <robl at mcs.anl.gov> wrote:
On Tue, 2019-11-12 at 13:42 +0000, Carns, Philip H. via mochi-devel
wrote:
> Hi Abid,
>
> It still sounds like your libfabric build does not include the verbs
> and rxm providers.  If you want to confirm this, then load the
> libfabric package and (on a compute node, within a job script) run
> "fi_info" and save the output.

OH! Abid, are you running from the login nodes?  That's not going to
work.

A more succinct way to get the information Phil requested: 'fi_info -l'
will list the providers it knows.

Here's the output from the Summit login node:

% fi_info -l
ofi_rxm:
    version: 1.0
shm:
    version: 1.1
ofi_perf_hook:
    version: 1.0
ofi_noop_hook:
    version: 1.0
ofi_mrail:
    version: 1.0

The Summit login node doesn't have any InfiniBand cards, so it does not
advertise the InfiniBand (verbs) provider.

But how about from a compute node?  Here's what I see:

% bsub -W 0:10 -nnodes 2 -P CSC332 -Is /bin/bash
Job <732804> is submitted to default queue <batch>.
<<Waiting for dispatch ...>>
<<Starting on batch1>>
robl at batch1:$ fi_info -l
verbs:
    version: 1.0
ofi_rxm:
    version: 1.0
shm:
    version: 1.1
ofi_perf_hook:
    version: 1.0
ofi_noop_hook:
    version: 1.0
ofi_mrail:
    version: 1.0

The rest of Phil's message is spot-on -- if you don't see 'verbs' from
the compute node, you'll have to rebuild libfabric.
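
If you'd rather script that check than eyeball the listing, here is a
minimal sketch. It assumes only that 'fi_info -l' prints each provider
as a "name:" line starting in the first column, as in the output above.
The helper name has_provider is made up for this example and is not
part of libfabric. Note that the rxm provider shows up as "ofi_rxm" in
the listing.

```shell
#!/bin/sh
# has_provider NAME
# Reads `fi_info -l` style output on stdin and succeeds (exit status 0)
# if a line starting with "NAME:" is present. The function name is
# hypothetical, made up for this sketch.
has_provider() {
    grep -q "^$1:"
}

# Intended use on a compute node (assumes fi_info is on PATH):
#   fi_info -l | has_provider verbs   || echo "verbs provider missing"
#   fi_info -l | has_provider ofi_rxm || echo "rxm provider missing"
```

Run it from inside the interactive job, not from the login node, since
the login node is not expected to list verbs at all.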


==rob


--
Abid M. Malik
******************************************************
"I have learned silence from the talkative, toleration from the intolerant, and kindness from the unkind"---Gibran
"Success is not for the chosen few, but for the few who choose" --- John Maxwell
"Being a good person does not depend on your religion or status in life, your race or skin color, political views or culture. IT DEPENDS ON HOW GOOD YOU TREAT OTHERS"--- Abid
"The Universe is talking to us, and the language of the Universe is mathematics."----Abid
