[Mochi-devel] margo_bulk_transfer() not working on sm transport

Carns, Philip H. carns at mcs.anl.gov
Mon Feb 10 15:11:24 CST 2020


Hi Pradeep,

The key error in that backtrace is this one: "na_sm_put(): process_vm_writev() failed (Operation not permitted)"

This means that your processes don't have sufficient permission to access each other's memory, which the sm transport needs for bulk transfers. This permission is usually set automatically on commercial Linux clusters, but if you are on a laptop or a home-grown system, you might need to run this (as root) on the node:

echo 0 > /proc/sys/kernel/yama/ptrace_scope

That setting will be lost on reboot, but there is probably a way to make it persistent; I just don't know off the top of my head. At any rate, you can try it by running the command above before deciding whether to make the change permanent.
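
Before changing anything, it may help to see what the node is currently set to. The following is a minimal sketch, not part of Mercury or Margo: the helper name describe_scope is hypothetical, and the one-line descriptions paraphrase the Linux Yama documentation. It only reads the setting and never modifies it.

```shell
#!/bin/sh
# Sketch: report the current Yama ptrace_scope value and what it implies
# for cross-process memory access (e.g. process_vm_writev) between peers.

scope_file=/proc/sys/kernel/yama/ptrace_scope

# describe_scope VALUE (hypothetical helper): print a one-line
# interpretation of a ptrace_scope value.
describe_scope() {
    case "$1" in
        0) echo "0: classic permissions; same-user processes may access each other's memory" ;;
        1) echo "1: restricted; only descendants or declared tracers may be accessed" ;;
        2) echo "2: admin-only; CAP_SYS_PTRACE is required" ;;
        3) echo "3: no attach; cannot be changed until reboot" ;;
        *) echo "unknown value: $1" ;;
    esac
}

if [ -r "$scope_file" ]; then
    # Yama is present: interpret the live value.
    describe_scope "$(cat "$scope_file")"
else
    # No Yama sysctl on this kernel; the restriction comes from elsewhere, if at all.
    echo "Yama is not enabled on this kernel (no $scope_file)"
fi
```

If the script reports 1 or higher, two unrelated processes on the node will most likely hit the "Operation not permitted" error shown above. To keep the value across reboots on distributions that read /etc/sysctl.d, a line "kernel.yama.ptrace_scope = 0" in a file in that directory is the usual approach; treat that as an assumption to verify on your system.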

thanks,
-Phil
________________________________
From: mochi-devel <mochi-devel-bounces at lists.mcs.anl.gov> on behalf of Pradeep Subedi <ps917 at ored.rutgers.edu>
Sent: Monday, February 10, 2020 4:06 PM
To: mochi-devel at lists.mcs.anl.gov <mochi-devel at lists.mcs.anl.gov>
Subject: [Mochi-devel] margo_bulk_transfer() not working on sm transport

Hi,

I am currently developing a Mochi service provider. The provider works on the tcp and sockets transport layers, but when I use the sm transport, I get errors.

On the client side the handle is created as:

    hret = margo_bulk_create(provider->client->mid, 1, (void**)(&data), &rdma_size,
                             HG_BULK_WRITE_ONLY, &in.handle);

On the server side the handle is created as:

    hret = margo_bulk_create(mid, 1, (void**)&buffer, &size,
                             HG_BULK_READ_ONLY, &bulk_handle);

The server fails on:

    hret = margo_bulk_transfer(mid, HG_BULK_PUSH, info->addr, in.handle, 0,
                               bulk_handle, 0, size);

with the following errors:

# NA -- Error -- /tmp/pradsubedi/spack-stage/spack-stage-mercury-master-6hruzald7l6bjf67o6nptv5inrqk2h3z/spack-src/src/na/na_sm.c:3741
# na_sm_put(): process_vm_writev() failed (Operation not permitted)
# HG -- Error -- /tmp/pradsubedi/spack-stage/spack-stage-mercury-master-6hruzald7l6bjf67o6nptv5inrqk2h3z/spack-src/src/mercury_bulk.c:829
# hg_bulk_transfer_pieces(): Could not transfer data (NA_PROTOCOL_ERROR)
# HG -- Error -- /tmp/pradsubedi/spack-stage/spack-stage-mercury-master-6hruzald7l6bjf67o6nptv5inrqk2h3z/spack-src/src/mercury_bulk.c:988
# hg_bulk_transfer(): Could not transfer data pieces
# HG -- Error -- /tmp/pradsubedi/spack-stage/spack-stage-mercury-master-6hruzald7l6bjf67o6nptv5inrqk2h3z/spack-src/src/mercury_bulk.c:1829
# HG_Bulk_transfer_id(): Could not start transfer of bulk data

Is this a known issue with shared memory transport for margo_bulk_transfer or should I be doing something different?

Thanks,
Pradeep Subedi


