[Mochi-devel] running mochi examples

Carns, Philip H. carns at mcs.anl.gov
Tue Nov 12 07:42:33 CST 2019


Hi Abid,

It still sounds like your libfabric build does not include the verbs and rxm providers.  If you want to confirm this, then load the libfabric package and (on a compute node, within a job script) run "fi_info" and save the output.

The output should include information about "verbs" and "rxm" interfaces in it.
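As a rough sketch of that check, the shell snippet below saves the fi_info output and looks for both providers. It assumes fi_info prints one "provider: ..." line per interface (for example "provider: verbs;ofi_rxm"). The helper names has_provider and check_providers and the file name fi_info.out are made up for illustration.

```shell
# Succeed if the saved fi_info output FILE has a "provider:" line containing NAME.
has_provider() {
    grep -q "^[[:space:]]*provider:.*$2" "$1"
}

# Report on the two providers discussed in this thread; return 1 if either is absent.
check_providers() {
    file=$1
    status=0
    for name in verbs rxm; do
        if has_provider "$file" "$name"; then
            echo "$name: found"
        else
            echo "$name: MISSING"
            status=1
        fi
    done
    return $status
}

# Only capture live output when fi_info is actually on the PATH
# (run this on a compute node, inside a job script).
if command -v fi_info >/dev/null 2>&1; then
    fi_info > fi_info.out 2>&1
    check_providers fi_info.out || echo "libfabric is missing a required provider"
fi
```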

If it does not, then you should uninstall libfabric and all Mochi packages that depend on it ("spack uninstall --all --dependents libfabric"), place the packages.yaml that Rob Latham provided in ~/.spack/linux/, install your software again, and repeat the above experiment.
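The rebuild steps above could be scripted roughly as follows. This is only a sketch: the run and rebuild_with_packages_yaml helpers and the DRY_RUN variable are hypothetical, the location of the downloaded packages.yaml is whatever you pass in, and "margo" in the usage line is simply the package name that appears in the module list in this thread. By default the script only prints the commands so they can be reviewed first.

```shell
# Print each command instead of running it unless DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: rebuild_with_packages_yaml /path/to/packages.yaml PACKAGE...
rebuild_with_packages_yaml() {
    yaml=$1
    shift
    # Remove libfabric and everything that was built against it.
    run spack uninstall --all --dependents libfabric
    # Install the site-specific package configuration.
    run mkdir -p "$HOME/.spack/linux"
    run cp "$yaml" "$HOME/.spack/linux/packages.yaml"
    # Reinstall the requested packages against the new configuration.
    run spack install "$@"
}
```

To review the commands, call "rebuild_with_packages_yaml ./packages.yaml margo"; to execute them, set DRY_RUN=0 first.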

The default libfabric build will not include the network support that you need for Summit; you really need to use Rob's packages.yaml, not only to get the right transports but also to have it use the right verbs library that's provided in the Summit environment.

thanks!
-Phil
________________________________
From: Abid Malik <abidmuslim at gmail.com>
Sent: Tuesday, November 12, 2019 7:53 AM
To: Carns, Philip H. <carns at mcs.anl.gov>
Cc: Dorier, Matthieu <mdorier at anl.gov>; Latham, Robert J. via mochi-devel <mochi-devel at lists.mcs.anl.gov>
Subject: Re: [Mochi-devel] running mochi examples

Thanks, Matthieu and Philip,

The messages are the same as those that appeared in the issue. Please let me know if you want me to build Mochi or run some examples in a way that can help to solve the problem.



On Tue, Nov 12, 2019 at 7:46 AM Carns, Philip H. <carns at mcs.anl.gov> wrote:
Abid followed up some more off-list; I think his code is building now, but he still has some run-time problems to work through.

I have a quick follow-up on one of the tangents in this discussion. The issue that was preventing "verbs://" from working correctly as the transport name has been fixed (you will need to uninstall and then re-install the mercury package to pick this up):

https://github.com/mercury-hpc/mercury/issues/322

That wasn't the core problem here, but something confusing that we hit along the way.

thanks,
-Phil
________________________________
From: Dorier, Matthieu <mdorier at anl.gov>
Sent: Sunday, November 10, 2019 3:06 PM
To: Abid Malik <abidmuslim at gmail.com>
Cc: Carns, Philip H. <carns at mcs.anl.gov>; Latham, Robert J. via mochi-devel <mochi-devel at lists.mcs.anl.gov>
Subject: Re: [Mochi-devel] running mochi examples


Hi Abid,



Sorry for the late response, I’m on vacation with limited access to my computer.

Let’s try to figure out what changed since your earlier messages about NA errors. Which libraries were rebuilt?



Thanks,



Matthieu



From: Abid Malik <abidmuslim at gmail.com>
Date: Wednesday, 6 November 2019 at 17:56
To: "Dorier, Matthieu" <mdorier at anl.gov>
Cc: "Carns, Philip H." <carns at mcs.anl.gov>, "Latham, Robert J. via mochi-devel" <mochi-devel at lists.mcs.anl.gov>
Subject: Re: [Mochi-devel] running mochi examples



Hi Matthieu,



For simple cases, I am using



gcc test.server.c -o server $(pkg-config --libs margo)



I have tried pretty much everything. I just installed libfabric manually. I am not sure what to try next.



On Wed, Nov 6, 2019 at 12:07 PM Dorier, Matthieu <mdorier at anl.gov> wrote:

Hi Abid,



Off the top of my head, this could be a problem with the order in which libraries are linked. Could you send your CMakeLists.txt file?

Thanks,



Matthieu



From: mochi-devel <mochi-devel-bounces at lists.mcs.anl.gov> on behalf of Abid Malik via mochi-devel <mochi-devel at lists.mcs.anl.gov>
Reply to: Abid Malik <abidmuslim at gmail.com>
Date: Tuesday, 5 November 2019 at 00:01
To: "Carns, Philip H." <carns at mcs.anl.gov>
Cc: "Latham, Robert J. via mochi-devel" <mochi-devel at lists.mcs.anl.gov>
Subject: Re: [Mochi-devel] running mochi examples





Thanks.



I am getting the following linking error:





Linking C executable 01_margo_hello_server
/usr/bin/ld: CMakeFiles/01_margo_hello_server.dir/hello_server.c.o: undefined reference to symbol 'ABT_thread_create'
/gpfs/alpine/csc299/world-shared/amalik/mochi/spack/opt/spack/linux-rhel7-power9le/gcc-6.4.0/argobots-develop-uwvdjjdzrkn475ax7bw4w7hztawzidg3/lib/libabt.so.0: error adding symbols: DSO missing from command line
/usr/bin/sha1sum: 01_margo_hello_server: No such file or directory
collect2: error: ld returned 1 exit status
make[2]: *** [01_margo_hello_server] Error 1
make[1]: *** [CMakeFiles/01_margo_hello_server.dir/all] Error 2

make: *** [all] Error 2



currently, I have the following:



Currently Loaded Modules:
  1) xl/16.1.1-3                      6) darshan-runtime/3.1.7               11) boost-1.70.0-gcc-6.4.0-dw376xg
  2) spectrum-mpi/10.3.0.1-20190611   7) DefApps                             12) libfabric-1.8.1-gcc-6.4.0-d4kxa4p
  3) hsi/5.0.2.p5                     8) argobots-develop-gcc-6.4.0-uwvdjjd  13) mercury-master-gcc-6.4.0-nbp34rm
  4) xalt/1.1.4                       9) bzip2-1.0.8-gcc-6.4.0-mdsmice       14) margo-0.5.2-gcc-6.4.0-gniufvt
  5) lsf-tools/2.0                   10) zlib-1.2.11-gcc-6.4.0-nbedgs2       15) cmake/3.15.2



Argobots is there. Am I missing something?
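For context, the "DSO missing from command line" message from ld generally means that an object file uses a symbol (here ABT_thread_create) from a shared library that reaches the link only as a dependency of another library, so that library has to be named on the link line explicitly. Having the module loaded is not enough on its own. The sketch below pulls the affected library paths out of saved linker output; missing_dsos is a hypothetical helper name and the example path is made up.

```shell
# Print each shared library that ld flags as "DSO missing from command line".
# Reads linker output from the named files, or from stdin when none are given.
missing_dsos() {
    sed -n 's/^\(.*\): error adding symbols: DSO missing from command line.*/\1/p' "$@"
}

# Example with a made-up path; for a real build use: make 2>&1 | missing_dsos
printf '%s\n' '/path/to/lib/libabt.so.0: error adding symbols: DSO missing from command line' | missing_dsos
```

Each library printed needs to appear on the link line itself. For a hand-written command that might look like gcc hello_server.c -o 01_margo_hello_server $(pkg-config --cflags --libs margo argobots), assuming an argobots.pc file is visible on PKG_CONFIG_PATH.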





Abid







On Mon, Nov 4, 2019 at 5:10 PM Carns, Philip H. <carns at mcs.anl.gov> wrote:

For the transport, it should work OK to just use "verbs://" now (the trailing colon and slashes are important, though). Mercury will automatically convert that to the more verbose string internally.
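To illustrate the two spellings, here is a small shell sketch. The transport_address helper is hypothetical: it only mirrors the strings quoted in this thread and is not a Mercury or Margo API. It also shows why the long form must be quoted on the command line, since an unquoted ';' would end the shell command early.

```shell
# Hypothetical helper: map "verbs://" to the verbose OFI string from this thread,
# pass other full addresses through, and reject names without the trailing "://".
transport_address() {
    case "$1" in
        verbs://) printf '%s\n' 'ofi+verbs;ofi_rxm://' ;;
        *://*)    printf '%s\n' "$1" ;;
        *)        echo "transport '$1' lacks the trailing '://'" >&2; return 1 ;;
    esac
}

addr=$(transport_address "verbs://")
# Show the launch line with the address quoted, as in the jsrun examples in this thread.
echo "jsrun -r1 -n1 ./margo-example-server \"$addr\""
```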



thanks,

-Phil

________________________________

From: mochi-devel <mochi-devel-bounces at lists.mcs.anl.gov> on behalf of Latham, Robert J. via mochi-devel <mochi-devel at lists.mcs.anl.gov>
Sent: Monday, November 4, 2019 3:11 PM
To: mochi-devel at lists.mcs.anl.gov <mochi-devel at lists.mcs.anl.gov>; abidmuslim at gmail.com <abidmuslim at gmail.com>
Subject: Re: [Mochi-devel] running mochi examples



On Mon, 2019-11-04 at 07:37 -0500, Abid Malik via mochi-devel wrote:
>
> Hello,
>
> I am trying to play with the mochi examples at
>
>   https://xgitlab.cels.anl.gov/sds/sds-examples.git
>
> I am trying the example on summit:
>
> /mercury/01_hello/01_hg_hello_server
>
> How should I start the server/client? I am trying it on login2 node
> and trying to initiate it like:
>
> ./01_hg_hello_server bmi+tcp://login2:1234

I see you are using spack.  If so, I think you would have had to go out
of your way to request the 'bmi' transport.  You probably built
libfabric.  You can find out for sure with 'spack find'

You tell margo (mercury, really) which protocol to use but the string
for "libfabric using the verbs provider" is a little odd:

"ofi+verbs;ofi_rxm://"

> and getting errors.
>
> # NA -- Error -- /tmp/amalik/spack-stage/spack-stage-mercury-master-nbp34rmg6rkc2ayiqxxxuupamvmdusr7/spack-src/src/na/na.c:281
>  # NA_Initialize_opt(): Specified class name does not support requested protocol

This says mercury did not recognize 'bmi+tcp'.  You'd get the same
error if you requested "moon_rock".

I don't have a lot of guidance to offer you (yet).  I just got started
on summit a few days ago.  When I try to take the advice I gave you I
get an error, too:

$ jsrun -r1 -n1 ./margo-example-server  verbs
# NA -- Error -- /tmp/robl/spack-stage/spack-stage-mercury-master-c6nwqortoqjsyviyinlg5t6sfay6wv7l/spack-src/src/na/na.c:279
 # NA_Initialize_opt(): Specified class name does not support requested protocol

That's what we expect.  margo/mercury don't know what 'verbs'
is.  Instead, we need to tell margo/mercury to use OFI (libfabric):

$ jsrun -r1 -n1 ./margo-example-server "ofi+verbs;ofi_rxm://"
# NA -- Error -- /tmp/robl/spack-stage/spack-stage-mercury-master-c6nwqortoqjsyviyinlg5t6sfay6wv7l/spack-src/src/na/na_ofi.c:1406
 # na_ofi_getinfo(): fi_getinfo() failed, rc: -61(No data available)

Uh, I'm still working on that one...

==rob

_______________________________________________
mochi-devel mailing list
mochi-devel at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel
https://www.mcs.anl.gov/research/projects/mochi




--

Abid M. Malik
******************************************************
"I have learned silence from the talkative, toleration from the intolerant, and kindness from the unkind"---Gibran
"Success is not for the chosen few, but for the few who choose" --- John Maxwell
"Being a good person does not depend on your religion or status in life, your race or skin color, political views or culture. IT DEPENDS ON HOW GOOD YOU TREAT OTHERS"--- Abid
"The Universe is talking to us, and the language of the Universe is mathematics."----Abid





