
nv_peer_memory's Introduction

GPUDirect RDMA

The latest advancement in GPU-GPU communications is GPUDirect RDMA. This technology provides a direct P2P (peer-to-peer) data path between GPU memory and NVIDIA HCA/NIC devices, which significantly decreases GPU-GPU communication latency and completely offloads the CPU, removing it from all GPU-GPU communications across the network.

Mellanox Product Family

General

MLNX_OFED 2.1 introduces an API between IB core and peer memory clients, such as NVIDIA Kepler-class GPUs, also known as GPUDirect RDMA. It gives the HCA the ability to read/write peer memory data buffers; as a result, RDMA-based applications can use the peer device's computing power over the RDMA interconnect without copying data to host memory.

This capability is supported with Mellanox ConnectX-3 VPI or Connect-IB InfiniBand adapters. It will also seamlessly work using RoCE technology with the Mellanox ConnectX-3 VPI adapters.

This README describes the steps required to complete the installation of the NVIDIA peer memory client with Mellanox OFED.

A kernel module with comparable functionality has been integrated into the GPU driver starting from release R470, under the name nvidia-peermem.
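On an R470 or newer driver, that bundled module can be loaded directly instead of building this package; a minimal sketch, assuming the driver was installed with nvidia-peermem included:

# Load the peer memory module that ships with the GPU driver (R470+)
modprobe nvidia-peermem
# Confirm it is loaded (lsmod reports the name with an underscore)
lsmod | grep nvidia_peermem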

Installation

Starting from version 1.2, nv_peer_mem requires an MLNX_OFED containing a fix for "Peer-direct patch may cause deadlock due to lock inversion" (tracked by Internal Ref. #2696789).

nv_peer_mem version 1.1 is the last one to support MLNX_OFED LTS 4.9.

Pre-requisites:

  1. A compatible NVIDIA driver is installed and up.
  2. MLNX_OFED 5.1 or newer (containing the fix for bug #2696789) is installed and up.

Failure to have the proper configuration (as described above) will result in build failure.

For the required NVIDIA driver and other relevant details in that area please check with NVIDIA support.
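A quick sanity check of these prerequisites before building might look like the following sketch (exact output varies per system):

# Confirm the NVIDIA driver is installed and loaded
nvidia-smi
lsmod | grep -w nvidia
# Confirm the MLNX_OFED version (5.1 or newer is required)
ofed_info -s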

To build source packages (src.rpm for RPM based OS and tarball for DEB based OS), use the build_module.sh script.

For example, to build on RPM based OS:

$ ./build_module.sh
Building source rpm for nvidia_peer_memory...

Built: /tmp/nvidia_peer_memory-1.3-0.src.rpm

To install run on RPM based OS:
# rpmbuild --rebuild /tmp/nvidia_peer_memory-1.3-0.src.rpm
# rpm -ivh <path to generated binary rpm file>

To build on DEB based OS:

$ ./build_module.sh
Building debian tarball for nvidia-peer-memory...

Built: /tmp/nvidia-peer-memory_1.3.orig.tar.gz

To install on DEB based OS:
# cd /tmp
# tar xzf /tmp/nvidia-peer-memory_1.3.orig.tar.gz
# cd nvidia-peer-memory-1.3
# dpkg-buildpackage -us -uc
# dpkg -i <path to generated deb files>            

To install run (excluding Ubuntu):

rpmbuild --rebuild <path to srpm>
rpm -ivh <path to generated binary rpm file> [On SLES add --nodeps]

To install on Ubuntu run:

dpkg-buildpackage -us -uc
dpkg -i <path to generated deb files>

(e.g. dpkg -i nvidia-peer-memory_1.3-0_all.deb
      dpkg -i nvidia-peer-memory-dkms_1.3-0_all.deb)

After successful installation:

  1. nv_peer_mem.ko is installed.
  2. A service file, /etc/init.d/nv_peer_mem, was added for start/stop/status of that kernel module.
  3. /etc/infiniband/nv_peer_mem.conf was added to control whether the kernel module is loaded on boot (default is YES).
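A minimal post-install check, using the files listed above:

# Confirm the kernel module is loaded
lsmod | grep nv_peer_mem
# Query the service script
/etc/init.d/nv_peer_mem status
# Check the load-on-boot setting
cat /etc/infiniband/nv_peer_mem.conf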

Notes

To achieve good performance, both the NIC and the GPU must physically sit on the same I/O root complex. Use lspci -tv to make sure that this is the case.
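For example, the PCIe placement can be inspected as follows (nvidia-smi topo -m is an optional cross-check, not something this README requires):

# Show the PCIe tree; the NIC and the GPU should hang off the same root complex
lspci -tv
# Optional: NVIDIA's view of the GPU/NIC topology (PIX/PXB paths are preferable to paths that traverse a CPU)
nvidia-smi topo -m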

nv_peer_memory's People

Contributors

adrianchiris, alaahl, drossetti, ferasd, haggaie, ianboyanzhang, johnspillernvidia, jon-chuang, pakmarkthub, rleon, tzafrir-mellanox, yishaih


nv_peer_memory's Issues

nv_peer_mem NCCL2 nccl-tests fails with: Out of bounds values : 24 FAILED

On two different GPU clusters, nv_peer_mem + NCCL2 failed to pass the NCCL sanity tests.
MVAPICH2-GDR + gdrcopy passed the tests with the same HW/SW.

This is related to issue under nccl-tests https://github.com/NVIDIA/nccl-tests/issues/7
Can anyone help?
Thanks.

Configurations:

MPI: OpenMPI 1.8.8/2.1.3/3.0.1
CUDA lib: CUDA 8.0/9.0/9.1
NCCL lib: NCCL 2.0.5/2.1.15
GDR lib: nv_peer_memory master
OFED: MLNX_OFED_LINUX-4.2-1
OS: Ubuntu1604/CentOS7.4
GPU: Kepler K80/Pascal P100
Server: Supermicro 4028-TR/4028-TR2
Topo interconnect: PIX

Driver Version: 390.30

To Reproduce

nccl-tests fail with GDR enabled:

-x NCCL_IB_DISABLE=0 -x NCCL_IB_CUDA_SUPPORT=1 -mca btl_openib_want_cuda_gdr 1

[15:37:58](root):~ # /root/mpi/cuda-9.0/ompi3-cuda/bin/mpirun -v --allow-run-as-root \
-x NCCL_SOCKET_IFNAME=ib0 -x NCCL_DEBUG=1 \
-x NCCL_IB_DISABLE=0 -x NCCL_IB_CUDA_SUPPORT=1 -mca btl_openib_want_cuda_gdr 1 \
-x LD_LIBRARY_PATH=/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64/lib:/usr/local/cuda-9.0/lib64 \
-mca btl_openib_if_include mlx5_3:1 \
-np 2 -host clx-mld-45,clx-mld-46 -pernode --oversubscribe \
/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64/ompi3tests/all_reduce_perf -b 9 -e 4M -g 1 -c 1 -z 0

nThread 1 nGpus 1 minBytes 9 maxBytes 4194304 step: 1048576(bytes) warmup iters: 5 iters: 20 validation: 1 
# NCCL Tests compiled with NCCL 2.1
# Using devices
#   Rank  0 on clx-mld-45 device  0 [0x04] Tesla P100-PCIE-16GB

#                                                 out-of-place                    in-place
#      bytes             N    type      op     time  algbw  busbw      res     time  algbw  busbw      res
#   Rank  1 on clx-mld-46 device  0 [0x04] Tesla P100-PCIE-16GB
           8             2   float     sum    0.144   0.00   0.00    0e+00    0.015   0.00   0.00    0e+00
     1048584        262146   float     sum    0.212   4.95   4.95    2e+00    0.209   5.02   5.02    2e+00
     2097160        524290   float     sum    0.379   5.53   5.53    2e+00    0.379   5.53   5.53    2e+00
     3145736        786434   float     sum    0.549   5.73   5.73    2e+00    0.548   5.74   5.74    2e+00
 Out of bounds values : 24 FAILED
 Avg bus bandwidth    : 4.06216 

-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[2940,1],0]
  Exit code:    1
--------------------------------------------------------------------------

nccl-tests OK, with GDR disabled:

-x NCCL_IB_DISABLE=0 -x NCCL_IB_CUDA_SUPPORT=0 -mca btl_openib_want_cuda_gdr 1

[15:50:24](root):~/mpi # /root/mpi/cuda-9.0/ompi3-cuda/bin/mpirun -v --allow-run-as-root \
-x NCCL_SOCKET_IFNAME=ib0 -x NCCL_DEBUG=1 \
-x NCCL_IB_DISABLE=0 -x NCCL_IB_CUDA_SUPPORT=0 -mca btl_openib_want_cuda_gdr 1 \
-x LD_LIBRARY_PATH=/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64/lib:/usr/local/cuda-9.0/lib64 \
-mca btl_openib_if_include mlx5_3:1 \
-np 2 -host clx-mld-45,clx-mld-46 -pernode --oversubscribe \
/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64/ompi3tests/all_reduce_perf -b 9 -e 4M -g 1 -c 1 -z 0

nThread 1 nGpus 1 minBytes 9 maxBytes 4194304 step: 1048576(bytes) warmup iters: 5 iters: 20 validation: 1 
# NCCL Tests compiled with NCCL 2.1
# Using devices
#   Rank  0 on clx-mld-45 device  0 [0x04] Tesla P100-PCIE-16GB

#                                                 out-of-place                    in-place
#      bytes             N    type      op     time  algbw  busbw      res     time  algbw  busbw      res
#   Rank  1 on clx-mld-46 device  0 [0x04] Tesla P100-PCIE-16GB
           8             2   float     sum    0.087   0.00   0.00    0e+00    0.018   0.00   0.00    0e+00
     1048584        262146   float     sum    0.396   2.65   2.65    0e+00    0.394   2.66   2.66    0e+00
     2097160        524290   float     sum    0.772   2.72   2.72    0e+00   25.292   0.08   0.08    0e+00
     3145736        786434   float     sum   27.539   0.11   0.11    0e+00   69.042   0.05   0.05    0e+00
 Out of bounds values : 0 OK
 Avg bus bandwidth    : 1.03398 

To build the faulty OpenMPI environment:

OpenMPI

cd /root/mpi/cuda-8.0/ompi3.0.1 && \
    rm -fr /root/mpi/cuda-8.0/ompi3.0.1/* && git checkout v3.0.1 && git reset --hard && \
    ./autogen.pl && \
    CC=/usr/bin/gcc CXX=/usr/bin/g++ FC=/usr/bin/gfortran ./configure --with-verbs --with-cuda=/usr/local/cuda-8.0 --prefix=/root/mpi/cuda-8.0/ompi3-cuda && \
    time make -j $(nproc) install

nccl-tests

cd /root/mpi/cuda-9.1/git/nccl-tests && \
    make MPI=1 NCCL_HOME=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64 CUDA_HOME=/usr/local/cuda-9.1 MPI_HOME=/root/mpi/cuda-9.1/ompi1-cuda DST_DIR=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64/ompi1tests -j $(nproc) && \
    make MPI=1 NCCL_HOME=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64 CUDA_HOME=/usr/local/cuda-9.1 MPI_HOME=/root/mpi/cuda-9.1/ompi2-cuda DST_DIR=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64/ompi2tests -j $(nproc) && \
    make MPI=1 NCCL_HOME=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64 CUDA_HOME=/usr/local/cuda-9.1 MPI_HOME=/root/mpi/cuda-9.1/ompi3-cuda DST_DIR=/root/mpi/cuda-9.1/nccl_2.1.15-1+cuda9.1_x86_64/ompi3tests -j $(nproc)

The same HW/SW and tests work properly with MVAPICH2-GDR + gdrcopy

nccl-tests OK, with GDR enabled:

-genv NCCL_IB_DISABLE=0 -genv NCCL_IB_CUDA_SUPPORT=1

[16:06:47](root):~/mpi # /opt/mvapich2/gdr/2.3a/mcast/no-openacc/cuda9.0/mofed4.2/mpirun/gnu4.8.5/bin/mpirun \
-genv LD_LIBRARY_PATH=/root/mpi/cuda-9.0/nccl_2.0.5-3+cuda9.0_amd64/lib:/usr/local/cuda-9.0/lib64:/opt/mvapich2/gdr/2.3a/mcast/no-openacc/cuda9.0/mofed4.2/mpirun/gnu4.8.5/lib64 \
-genv MV2_GPUDIRECT_GDRCOPY_LIB=/root/mpi/cuda-9.0/gdr/lib64/libgdrapi.so \
-genv GDRCOPY_ENABLE_LOGGING=1 -genv GDRCOPY_LOG_LEVEL=5 -genv MV2_USE_GPUDIRECT=1 \
-genv NCCL_IB_DISABLE=0 -genv NCCL_IB_CUDA_SUPPORT=1 -genv NCCL_DEBUG=0 -genv NCCL_SOCKET_IFNAME=enp5s0f0 \
-np 2 -host clx-mld-45,clx-mld-46  /root/mpi/cuda-9.0/nccl_2.0.5-3+cuda9.0_amd64/mvapich2tests/all_reduce_perf -b 9 -e 4M -g 4 -c 1 -z 0

nThread 1 nGpus 4 minBytes 9 maxBytes 4194304 step: 1048576(bytes) warmup iters: 5 iters: 20 validation: 1 
# NCCL Tests compiled with NCCL 2.0
# Using devices
#   Rank  0 on clx-mld-45 device  0 [0x04] Tesla P100-PCIE-16GB
#   Rank  1 on clx-mld-45 device  1 [0x06] Tesla P100-PCIE-16GB
#   Rank  2 on clx-mld-45 device  2 [0x07] Tesla P100-PCIE-16GB
#   Rank  3 on clx-mld-45 device  3 [0x08] Tesla P100-PCIE-16GB
#   Rank  4 on clx-mld-46 device  0 [0x04] Tesla P100-PCIE-16GB
#   Rank  5 on clx-mld-46 device  1 [0x06] Tesla P100-PCIE-16GB
#   Rank  6 on clx-mld-46 device  2 [0x07] Tesla P100-PCIE-16GB

#                                                 out-of-place                    in-place
#      bytes             N    type      op     time  algbw  busbw      res     time  algbw  busbw      res
#   Rank  7 on clx-mld-46 device  3 [0x08] Tesla P100-PCIE-16GB
           8             2   float     sum    0.149   0.00   0.00    0e+00    0.151   0.00   0.00    0e+00
     1048584        262146   float     sum    0.308   3.41   5.96    1e-06    0.304   3.45   6.04    1e-06
     2097160        524290   float     sum    0.491   4.27   7.48    1e-06    0.486   4.32   7.56    1e-06
     3145736        786434   float     sum    0.678   4.64   8.12    1e-06    0.678   4.64   8.12    1e-06
 Out of bounds values : 0 OK
 Avg bus bandwidth    : 5.40981 

To build the working MVAPICH2 environment:

gdrcopy

cd /root/mpi/cuda-8.0/git/gdrcopy && \
    make PREFIX=/root/mpi/cuda-8.0/gdr CUDA=/usr/local/cuda-8.0 -j $(nproc) all install
cd /root/mpi/cuda-9.0/git/gdrcopy && \
    make PREFIX=/root/mpi/cuda-9.0/gdr CUDA=/usr/local/cuda-9.0 -j $(nproc) all install

nccl-tests

cd /root/mpi/cuda-9.0/git/nccl-tests && \
    make MPI=1 NCCL_HOME=/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64 CUDA_HOME=/usr/local/cuda-9.0 MPI_HOME=/opt/mvapich2/gdr/2.3a/mcast/no-openacc/cuda9.0/mofed4.2/mpirun/gnu4.8.5 LIBRARY_PATH=/opt/mvapich2/gdr/2.3a/mcast/no-openacc/cuda9.0/mofed4.2/mpirun/gnu4.8.5/lib64 DST_DIR=/root/mpi/cuda-9.0/nccl_2.1.15-1+cuda9.0_x86_64/mvapich2tests -j $(nproc)

Why is nv_peer_memory severely deteriorating all_reduce_perf result?

I am running benchmark testing using nccl_test. I have 2 nodes, which are connected via RoCE, and I have installed nv_peer_memory. However, once I turn on GPU Direct RDMA, the all_reduce_perf bandwidth gets dramatically worse than without GPU Direct RDMA. I am aware that GPU PCIe topology matters, and that's why I am only using GPU0 on both nodes, since GPU0 and the Mellanox HCA are connected to the same CPU.
The GPU topology is shown in the attached screenshot (not reproduced here).
Without GPU Direct RDMA and just plain RoCE, GPU0 on node 1 <-> GPU0 on node 2: [results screenshot omitted]

With GPU Direct RDMA and just plain RoCE, GPU0 on node 1 <-> GPU0 on node 2: [results screenshot omitted]

According to this suggested system support, having a single CPU in between the GPU and the Mellanox HCA will yield worse performance. But I never expected it to be this much worse.

At this point, I am wondering if there is any tool that can help debug nv_peer_mem to make sure it really takes effect? Or maybe there is something I misconfigured?

Here is the detail about my environment.
Nvidia Tesla V100
CUDA9.0
NCCL2.2.13
OFED4.2-1.2.0
Mellanox MT27710 ConnectX-4Lx
nvidia_peer_memory1.0-8

I notice that the log says 'No module present for GPU Direct RDMA'. When I check its status, this is what it looks like (screenshot omitted). Is this normal?

Can this be used without RDMA?

Can this library be used to pass GPU-mapped memory to something like DPDK, such that the NIC would write directly to GPU memory without RDMA? It seems like it should be as simple as giving DPDK a mempool on the GPU, but it's not clear if this library helps with that.

Debian installation command line error

I am using Debian 9 (new stable release since 17th June, 2017). According to the Installation part of README.md:

$ dpkg-buildpackage -us -uc
dpkg-buildpackage: info: source package nvidia-peer-memory
dpkg-buildpackage: info: source version 1.0-4
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by Feras Daoud <[email protected]>
dpkg-buildpackage: info: host architecture amd64
 dpkg-source --before-build nv_peer_memory
 fakeroot debian/rules clean
dh clean --with dkms
dh: Compatibility levels before 9 are deprecated (level 8 in use)
   dh_testdir
   dh_clean
dh_clean: Compatibility levels before 9 are deprecated (level 8 in use)
 dpkg-source -b nv_peer_memory
dpkg-source: error: can't build with source format '3.0 (quilt)': no upstream tarball found at ../nvidia-peer-memory_1.0.orig.tar.{bz2,gz,lzma,xz}
dpkg-buildpackage: error: dpkg-source -b nv_peer_memory gave error exit status 255

A workaround: dpkg-buildpackage -us -uc needs to be changed to dpkg-buildpackage -us -uc -b.
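In other words, building binary packages only avoids the missing .orig tarball; a minimal sketch of that workaround:

# -b builds binary packages only, so dpkg-source does not look for the upstream tarball
dpkg-buildpackage -us -uc -b
# then install the generated debs as usual
dpkg -i <path to generated deb files>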

CentOS 7 problem: modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

[user@bogon ~]$ cd rpmbuild/
[user@bogon rpmbuild]$ cd RPMS/
[user@bogon RPMS]$ cd x86_64/
[user@bogon x86_64]$ ls
nvidia_peer_memory-1.0-7.x86_64.rpm
[user@bogon x86_64]$ rpm -ivh nvidia_peer_memory-1.0-7.x86_64.rpm
error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Permission denied)
[user@bogon x86_64]$ sudo rpm -ivh nvidia_peer_memory-1.0-7.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:nvidia_peer_memory-1.0-7 ################################# [100%]
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

[user@bogon ~]$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[user@bogon ~]$ uname -a
Linux gpu0 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[user@ bogon ~]$ lspci |grep mellanox -i
01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
[user@ bogon ~]$ ofed_info|head -1
MLNX_OFED_LINUX-4.4-1.0.0.0 (OFED-4.4-1.0.0):
[user@ bogon ~]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

My system setting:
System: Ubuntu 16.04
CUDA Version: 9.0
GPU Driver Version: 387.26.

I'm trying to install this module for GPU Direct RDMA. But the error occurs when I install sudo dpkg -i nvidia-peer-memory-dkms_1.0-5_all.deb

(Reading database ... 162399 files and directories currently installed.)
Preparing to unpack nvidia-peer-memory-dkms_1.0-5_all.deb ...

-------- Uninstall Beginning --------
Module:  nvidia-peer-memory
Version: 1.0
Kernel:  4.4.0-104-generic (x86_64)
-------------------------------------

Status: Before uninstall, this module version was ACTIVE on this kernel.

nv_peer_mem.ko:
 - Uninstallation
   - Deleting from: /lib/modules/4.4.0-104-generic/updates/dkms/
 - Original module
   - No original module was found for this module on this kernel.
   - Use the dkms install command to reinstall any previous module version.

depmod....

DKMS: uninstall completed.

------------------------------
Deleting module version: 1.0
completely from the DKMS tree.
------------------------------
Done.
Unpacking nvidia-peer-memory-dkms (1.0-5) over (1.0-5) ...
Setting up nvidia-peer-memory-dkms (1.0-5) ...

Creating symlink /var/lib/dkms/nvidia-peer-memory/1.0/source ->
                 /usr/src/nvidia-peer-memory-1.0

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area....
make KERNELRELEASE=4.4.0-104-generic all KVER=4.4.0-104-generic KDIR=/lib/modules/4.4.0-104-generic/build....
cleaning build area....

DKMS: build completed.

nv_peer_mem:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.4.0-104-generic/updates/dkms/

depmod....

DKMS: install completed.
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

Can a ConnectX-3 HCA use ibv_reg_mr to register GPU memory created by cudaMalloc?

yuxin420@luigi:~/TEMP/rdma$ lspci -v | grep Mellanox
01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
	Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
82:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
	Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]

I installed Mellanox OFED to use IB verbs. It works fine if the buffer is in CPU memory, but if I use ibv_reg_mr to register a buffer on the GPU, it fails. Do I need to use another verb to register GPU memory, or is there some setup I should do to enable it?

Thanks!

Yuxin

nvidia_peer_memory-1.0-8 modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

CentOS Linux release 7.7.1908 (Core)

uname -r

3.10.0-1062.9.1.el7.x86_64

lspci |grep mellanox -i

5e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
5e:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]

ofed_info -s

MLNX_OFED_LINUX-4.7-1.0.0.1:

nvidia-smi

Sat Dec 7 00:07:38 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
...

./build_module.sh

Building source rpm for nvidia_peer_memory...

Built: /tmp/nvidia_peer_memory-1.0-8.src.rpm

To install run on RPM based OS:
# rpmbuild --rebuild /tmp/nvidia_peer_memory-1.0-8.src.rpm
# rpm -ivh

[root@bmlp-c08006:/tmp/nv_peer_memory]# rpmbuild --rebuild /tmp/nvidia_peer_memory-1.0-8.src.rpm
Installing /tmp/nvidia_peer_memory-1.0-8.src.rpm
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.SBGi1I

  • umask 022
  • cd /root/rpmbuild/BUILD
  • cd /root/rpmbuild/BUILD
  • rm -rf nvidia_peer_memory-1.0
  • /usr/bin/gzip -dc /root/rpmbuild/SOURCES/nvidia_peer_memory-1.0.tar.gz
  • /usr/bin/tar -xvvf -
    drwxr-xr-x root/root 0 2019-12-07 00:04 nvidia_peer_memory-1.0/
    drwxr-xr-x root/root 0 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/
    drwxr-xr-x root/root 0 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/patches/
    -rw-r--r-- root/root 369 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/patches/dkms_name.patch
    -rw-r--r-- root/root 16 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/patches/series
    drwxr-xr-x root/root 0 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/source/
    -rw-r--r-- root/root 12 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/source/format
    -rw-r--r-- root/root 1791 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/changelog
    -rw-r--r-- root/root 2 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/compat
    -rw-r--r-- root/root 910 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/control
    -rw-r--r-- root/root 10 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/nvidia-peer-memory-dkms.dkms
    -rw-r--r-- root/root 245 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/nvidia-peer-memory-dkms.postinst
    -rwxr-xr-x root/root 198 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/nvidia-peer-memory.postinst
    -rwxr-xr-x root/root 199 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/nvidia-peer-memory.prerm
    -rwxr-xr-x root/root 1362 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/rules
    -rwxr-xr-x root/root 431 2019-12-07 00:04 nvidia_peer_memory-1.0/debian/updateInit.sh
    -rw-r--r-- root/root 3707 2019-12-07 00:04 nvidia_peer_memory-1.0/Makefile
    -rw-r--r-- root/root 3415 2019-12-07 00:04 nvidia_peer_memory-1.0/README.md
    -rwxr-xr-x root/root 2276 2019-12-07 00:04 nvidia_peer_memory-1.0/build_module.sh
    -rw-r--r-- root/root 5817 2019-12-07 00:04 nvidia_peer_memory-1.0/compat_nv-p2p.h
    -rwxr-xr-x root/root 4031 2019-12-07 00:04 nvidia_peer_memory-1.0/create_nv.symvers.sh
    -rw-r--r-- root/root 614 2019-12-07 00:04 nvidia_peer_memory-1.0/dkms.conf
    -rwxr-xr-x root/root 2756 2019-12-07 00:04 nvidia_peer_memory-1.0/nv_peer_mem
    -rwxr-xr-x root/root 13013 2019-12-07 00:04 nvidia_peer_memory-1.0/nv_peer_mem.c
    -rw-r--r-- root/root 47 2019-12-07 00:04 nvidia_peer_memory-1.0/nv_peer_mem.conf
    -rwxr-xr-x root/root 241 2019-12-07 00:04 nvidia_peer_memory-1.0/nv_peer_mem.upstart
    -rw-r--r-- root/root 3299 2019-12-07 00:04 nvidia_peer_memory-1.0/nvidia_peer_memory.spec
  • STATUS=0
  • '[' 0 -ne 0 ']'
  • cd nvidia_peer_memory-1.0
  • /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
  • exit 0
    Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.wIhXAm
  • umask 022
  • cd /root/rpmbuild/BUILD
  • cd nvidia_peer_memory-1.0
  • export KVER=3.10.0-1062.9.1.el7.x86_64
  • KVER=3.10.0-1062.9.1.el7.x86_64
  • make KVER=3.10.0-1062.9.1.el7.x86_64 all
    /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/create_nv.symvers.sh 3.10.0-1062.9.1.el7.x86_64
    '/lib/modules/3.10.0-1062.9.1.el7.x86_64/extra/nvidia.ko.xz' -> './nvidia.ko.xz'
    Getting symbol versions from nvidia.ko ...
    Created: /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv.symvers
    Found /usr/src/nvidia-440.33.01//nvidia/nv-p2p.h
    /bin/cp -f /usr/src/nvidia-440.33.01//nvidia/nv-p2p.h /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv-p2p.h
    cp -rf /usr/src/ofa_kernel/default/Module.symvers .
    cat nv.symvers >> Module.symvers
    make -C /lib/modules/3.10.0-1062.9.1.el7.x86_64/build M=/root/rpmbuild/BUILD/nvidia_peer_memory-1.0 modules
    make[1]: Entering directory `/usr/src/kernels/3.10.0-1062.9.1.el7.x86_64'
      CC [M]  /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.o
    /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
     #pragma message("Enable nvidia_p2p_dma_map_pages support")
             ^
      Building modules, stage 2.
      MODPOST 1 modules
      CC      /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.mod.o
      LD [M]  /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.ko
    make[1]: Leaving directory `/usr/src/kernels/3.10.0-1062.9.1.el7.x86_64'
  • exit 0
    Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.gQcTGj
  • umask 022
  • cd /root/rpmbuild/BUILD
  • '[' /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64 '!=' / ']'
  • rm -rf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64
    ++ dirname /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64
  • mkdir -p /root/rpmbuild/BUILDROOT
  • mkdir /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64
  • cd nvidia_peer_memory-1.0
  • export KVER=3.10.0-1062.9.1.el7.x86_64
  • KVER=3.10.0-1062.9.1.el7.x86_64
  • make DESTDIR=/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64 KVER=3.10.0-1062.9.1.el7.x86_64 install
    mkdir -p /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64//lib/modules/3.10.0-1062.9.1.el7.x86_64/extra/;
    cp -f /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.ko /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64//lib/modules/3.10.0-1062.9.1.el7.x86_64/extra/;
    if [ ! -n "/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64" ]; then /sbin/depmod -r -ae 3.10.0-1062.9.1.el7.x86_64;fi;
  • install -d /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64/etc/infiniband
  • install -m 0644 /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.conf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64/etc/infiniband
  • install -d /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64/etc/init.d
  • install -m 0755 /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64/etc/init.d
  • /usr/lib/rpm/check-buildroot
  • /usr/lib/rpm/redhat/brp-compress
  • /usr/lib/rpm/redhat/brp-strip /usr/bin/strip
  • /usr/lib/rpm/redhat/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump
  • /usr/lib/rpm/redhat/brp-strip-static-archive /usr/bin/strip
  • /usr/lib/rpm/brp-python-bytecompile /usr/bin/python 1
  • /usr/lib/rpm/redhat/brp-python-hardlink
  • /usr/lib/rpm/redhat/brp-java-repack-jars
    Processing files: nvidia_peer_memory-1.0-8.x86_64
    Provides: nvidia_peer_memory = 1.0-8 nvidia_peer_memory(x86-64) = 1.0-8
    Requires(interp): /bin/sh /bin/sh
    Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
    Requires(post): /bin/sh
    Requires(preun): /bin/sh
    Requires: /bin/bash
    Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64
    Wrote: /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-8.x86_64.rpm
    Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.Cqwtul
  • umask 022
  • cd /root/rpmbuild/BUILD
  • cd nvidia_peer_memory-1.0
  • cd /tmp
  • chmod -R o+w /root/rpmbuild/BUILD/nvidia_peer_memory-1.0
  • rm -rf /root/rpmbuild/BUILD/nvidia_peer_memory-1.0
  • test x/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64 '!=' x
  • rm -rf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-8.x86_64
  • exit 0
    Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.ShKItm
  • umask 022
  • cd /root/rpmbuild/BUILD
  • rm -rf nvidia_peer_memory-1.0
  • exit 0

yum install /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-8.x86_64.rpm

Loaded plugins: enabled_repos_upload, fastestmirror, langpacks, nvidia, package_upload, product-id, search-disabled-repos, subscription-manager
Examining /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-8.x86_64.rpm: nvidia_peer_memory-1.0-8.x86_64
Marking /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-8.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package nvidia_peer_memory.x86_64 0:1.0-8 will be installed
--> Finished Dependency Resolution
...
Dependencies Resolved
Package Arch Version Repository Size
Installing:
nvidia_peer_memory x86_64 1.0-8 /nvidia_peer_memory-1.0-8.x86_64 291 k

Transaction Summary

Install 1 Package

Total size: 291 k
Installed size: 291 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : nvidia_peer_memory-1.0-8.x86_64 1/1
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

/etc/init.d/nv_peer_mem restart

stopping... OK
starting... modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument
Failed to load nv_peer_mem

#dmesg
...
[ 2072.534744] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[ 2072.534750] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[ 2072.534767] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[ 2072.534768] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[ 2072.534779] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[ 2072.534780] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[ 2072.534801] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[ 2072.534802] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[ 2072.534810] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[ 2072.534811] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[ 2072.534819] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[ 2072.534820] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)

GPUDirect RDMA sometimes misses writing a word to remote

Hi, my project uses GPUDirect RDMA to send intermediate computation results in GPU memory to remote CPU memory (via RDMA WRITE). First, the data is copied to a staging device buffer (zeroed out beforehand) that was registered as an MR; then the buffer and msg_sz are passed to post_send() to send the data to the remote side.

// prepare RDMA buffer for RDMA-WRITE
char *rdma_buf = gmem->buffer(tid);
GPU_ASSERT( cudaMemcpy(rdma_buf, &data_sz, sizeof(uint64_t), cudaMemcpyHostToDevice) ); // header
rdma_buf += sizeof(uint64_t);
GPU_ASSERT( cudaMemcpy(rdma_buf, data, data_sz, cudaMemcpyDeviceToDevice) );    // data
rdma_buf += roundup(data_sz, sizeof(uint64_t));
GPU_ASSERT( cudaMemcpy(rdma_buf, &data_sz, sizeof(uint64_t), cudaMemcpyHostToDevice) );  // footer

My messaging protocol is the same as FaRM's, which uses a ring buffer to store messages. In my case, the ring buffer has only one writer and one reader.
The structure of a message is [ header | payload | footer ], and the size of the payload is encoded in the header and footer. The problem is that sometimes a message's footer is missing on the receiver side; its value becomes 0.

I resorted to ibdump to capture the RDMA traffic and found that the DMA Length in the RETH is correct, but the footer was indeed missing from the last packet! One thing to notice is that if I copy the data from GPU memory to host memory and then send it via normal RDMA (without GPUDirect), everything is OK.
I have no idea why this happened, can you guys give me some hints?

Setup:

  • Ubuntu 16.04
  • Mellanox ConnectX-3 56Gbps
  • NVIDIA Tesla K40m
  • NVIDIA driver version: 384.111
  • CUDA 8
  • nv_peer_mem: 1.0-3
  • MLNX_OFED 4.0-2.0.0.1

make install is broken on RH 7.x

$ uname -r
3.10.0-327.el7.x86_64

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)

$ make install
mkdir -p //lib/modules/3.10.0-327.el7.x86_64/extra/;
cp -f /root/nv_peer_memory/nv_peer_mem.ko //lib/modules/3.10.0-327.el7.x86_64/extra/;
if [ ! -n "" ]; then -r -ae 3.10.0-327.el7.x86_64;fi;
/bin/sh: -r: command not found
make: *** [install] Error 127
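Until the install rule is fixed, a hedged manual recovery (assuming the .ko was still copied into /lib/modules/.../extra/) is to run depmod yourself:

# Regenerate module dependency data for the running kernel
depmod -a $(uname -r)
# Then try loading the module
modprobe nv_peer_mem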

Error: nv_peer_mem: Unknown symbol ib_register_peer_memory_client

Hi, I met the same problem as #28.

More Information:

# lspci | grep Mell
0000:b3:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
0000:b3:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

# ofed_info -n
5.0-2.1.8

# ls -l /lib/modules
total 24
drwxr-xr-x  7 root root 4096 Nov 24 20:11 3.10.0-1062.18.1.el7.x86_64
drwxr-xr-x  3 root root 4096 Apr 22  2020 3.10.0-1062.9.1.el7.x86_64
drwxr-xr-x  3 root root 4096 Apr 20  2020 3.10.0-1062.el7.x86_64
drwxr-xr-x  3 root root 4096 Dec 25  2019 3.10.0-957.21.3.el7.x86_64
drwxr-xr-x. 3 root root 4096 Apr 20  2020 3.10.0-957.el7.x86_64
drwxr-xr-x  6 root root 4096 Nov 25 15:41 4.19.95-7

# ls -l /usr/src/ofa_kernel/
total 4
drwxr-xr-x 7 root root 4096 Aug  5 16:12 default

Help welcome!

dpkg-buildpackage error

I am following the procedure below. I hit an issue at the step that runs "dpkg-buildpackage -us -uc".

Ubuntu 18.04.5 LTS (Bionic Beaver), Kernel: 5.4.0-45-generic
NVIDIA Driver Version: 455.23.05 CUDA Version: 11.1
MLNX_OFED_LINUX-5.1-2.4.6.0

wget https://www.mellanox.com/sites/default/files/downloads/ofed/nvidia-peer-memory_1.1.tar.gz
then untar it,
cd nvidia_peer_memory-1.1
./build_module.sh
cd /tmp
tar xzf /tmp/nvidia-peer-memory_1.1.orig.tar.gz
cd nvidia-peer-memory-1.1
dpkg-buildpackage -us -uc
dpkg -i

root@xxxx:/tmp/nvidia-peer-memory-1.1# dpkg-buildpackage -us -uc
dpkg-buildpackage: info: source package nvidia-peer-memory
dpkg-buildpackage: info: source version 1.1-0
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by Feras Daoud [email protected]
dpkg-buildpackage: info: host architecture amd64
dpkg-source --before-build nvidia-peer-memory-1.1
debian/rules clean
dh clean --with dkms
dh_clean
dpkg-source -b nvidia-peer-memory-1.1
dpkg-source: info: using source format '3.0 (quilt)'
dpkg-source: info: building nvidia-peer-memory using existing ./nvidia-peer-memory_1.1.orig.tar.gz
patching file dkms.conf
Reversed (or previously applied) patch detected! Skipping patch.
1 out of 1 hunk ignored
dpkg-source: info: the patch has fuzz which is not allowed, or is malformed
dpkg-source: info: if patch 'dkms_name.patch' is correctly applied by quilt, use 'quilt refresh' to update it
dpkg-source: error: LC_ALL=C patch -t -F 0 -N -p1 -u -V never -E -b -B .pc/dkms_name.patch/ --reject-file=- < nvidia-peer-memory-1.1.orig.83fK3z/debian/patches/dkms_name.patch subprocess returned exit status 1
dpkg-buildpackage: error: dpkg-source -b nvidia-peer-memory-1.1 subprocess returned exit status 2

build_module.sh: tar error about ineffective --exclude parameters

On Ubuntu 19.10 (tar 1.30+dfsg-6) I get the following error from build_module.sh:

tar: The following options were used after any non-optional arguments in archive create or update mode.  These options are positional and affect only arguments that follow them.  Please, rearrange them properly.
tar: --exclude .* has no effect
tar: --exclude build_release.sh has no effect
tar: Exiting with failure status due to previous errors

See https://www.gnu.org/software/tar/manual/html_node/Position_002dSensitive-Options.html
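For reference, positional --exclude options must come before the arguments they are meant to affect; a generic illustration (archive and directory names are made up):

# Rejected by newer tar: the --exclude options come after the directory being archived
tar czf /tmp/src.tar.gz nvidia_peer_memory-1.3 --exclude='.*' --exclude=build_release.sh
# Works: the --exclude options precede the directory
tar czf /tmp/src.tar.gz --exclude='.*' --exclude=build_release.sh nvidia_peer_memory-1.3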

create_nv.symvers.sh fails because the kernel module name ends with .ko.xz instead of .ko

"nm -o $nvidia_mod" in create_nv.symvers.sh is looking for .ko but kernel module names on CentOS 7 end with .ko.xz.
Thus, it failed to get symbol names.

Below is the change I made to work around it.

--- create_nv.symvers.sh.new	2018-05-09 10:38:40.033345119 -0700
+++ create_nv.symvers.sh.old	2018-05-09 10:38:08.114218425 -0700
@@ -77,9 +77,6 @@
 if [ ! -e "$nvidia_mod" ]; then
 	continue
 fi
-	cp $nvidia_mod .
-	nvidia_mod=$(echo $nvidia_mod | sed "s/.xz//g" | sed "s/\// /g" |awk '{print $NF}')
-	xz -d ${nvidia_mod}.xz
 	if ! (nm -o $nvidia_mod | grep -q "__crc_nvidia_p2p_"); then
 		continue
 	fi
    

add support for NV dma mappings APIs

New APIs are being added in r361+ drivers. This will allow support for architectures where BARs have bus address != physical address.

APIs are:
typedef struct nvidia_p2p_dma_mapping {
    enum nvidia_p2p_page_size_type page_size_type;
    uint32_t entries;
    uint64_t *dma_addresses;
} nvidia_p2p_dma_mapping_t;

int nvidia_p2p_dma_map_pages(struct pci_dev *peer,
        struct nvidia_p2p_page_table *page_table,
        struct nvidia_p2p_dma_mapping **dma_mapping);
int nvidia_p2p_dma_unmap_pages(struct pci_dev *peer,
        struct nvidia_p2p_page_table *page_table,
        struct nvidia_p2p_dma_mapping *dma_mapping);

Given the presence of struct pci_dev, it might be that the NVIDIA-related symvers will change on customer machines, depending on the kernel version or even on the kernel configuration alone.
That is why shipping a single symvers file within this project is not a solid solution.
A new mechanism will be needed instead.

POWER9 AC922 GPUDirect 39Gb/s only

Hi,
The test system is an IBM AC922 with 2x Tesla V100 and a ConnectX-5 EN, connected back to back with an x86_64 Dell R740.

Using RoCEv2 (UC queue pair, WRITE verbs), I get 97 Gb/s of bandwidth to the AC922's CPU memory but only 39 Gb/s to Tesla memory.
The nv_peer_mem and nv_rsync_mem modules are loaded, nvidia-persistenced is started, and MLNX_OFED is installed.

The same issue occurs with either custom code or perftest/ib_write_bw.

The BW observed between two Dell machines with Quadro P6000 GPUs is as expected.

create_nv.symvers.sh is broken

This change (25774c3#diff-bdbe24543d2311a2bc6b64a3d102fc31L90) makes the script return the wrong symbol versions:

Getting symbol versions from /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia.ko ...
0x000000004c9ba34e  nvidia_p2p_destroy_mapping  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000095397da4  nvidia_p2p_dma_map_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000043c5682e  nvidia_p2p_dma_unmap_pages  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x00000000adf40bc1  nvidia_p2p_free_dma_mapping /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x000000005868c8aa  nvidia_p2p_free_page_table  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x000000001c254be6  nvidia_p2p_get_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x00000000d186c986  nvidia_p2p_get_rsync_registers  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x00000000b73bde45  nvidia_p2p_init_mapping /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x000000007e399228  nvidia_p2p_put_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000051395286  nvidia_p2p_put_rsync_registers  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x000000005d649138  nvidia_p2p_register_rsync_driver    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x000000009718676b  nvidia_p2p_unregister_rsync_driver  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009c80  nvidia_p2p_destroy_mapping  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009f20  nvidia_p2p_dma_map_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009b60  nvidia_p2p_dma_unmap_pages  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009160  nvidia_p2p_free_dma_mapping /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009c70  nvidia_p2p_free_page_table  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009470  nvidia_p2p_get_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009930  nvidia_p2p_get_rsync_registers  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009df0  nvidia_p2p_init_mapping /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009d20  nvidia_p2p_put_pages    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x00000000000098c0  nvidia_p2p_put_rsync_registers  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009860  nvidia_p2p_register_rsync_driver    /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia
0x0000000000009be0  nvidia_p2p_unregister_rsync_driver  /lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia

And:

$ cat Module.symvers  | egrep nvidia_p2p
0x00009160	nvidia_p2p_free_dma_mapping	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009470	nvidia_p2p_get_pages	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009860	nvidia_p2p_register_rsync_driver	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009f20	nvidia_p2p_dma_map_pages	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009df0	nvidia_p2p_init_mapping	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009be0	nvidia_p2p_unregister_rsync_driver	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009c80	nvidia_p2p_destroy_mapping	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009d20	nvidia_p2p_put_pages	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009930	nvidia_p2p_get_rsync_registers	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009b60	nvidia_p2p_dma_unmap_pages	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x00009c70	nvidia_p2p_free_page_table	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)
0x000098c0	nvidia_p2p_put_rsync_registers	/lib/modules/4.4.0-141-generic/kernel/drivers/video/nvidia	(unknown)

The result is:

[2385461.944989] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[2385461.944993] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[2385461.945013] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[2385461.945014] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[2385461.945026] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[2385461.945028] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[2385461.945084] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[2385461.945085] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[2385461.945096] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[2385461.945097] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[2385461.945107] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[2385461.945109] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)
[2385489.780058] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[2385489.780062] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[2385489.780081] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[2385489.780082] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[2385489.780094] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[2385489.780096] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[2385489.780150] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[2385489.780151] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[2385489.780162] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[2385489.780164] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[2385489.780174] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[2385489.780175] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)

The fix is to revert the commit so that the loop again ends with: done < <(nm -o $nvidia_mod | grep "__crc_nvidia_p2p_")

ERROR: "nvidia_p2p_*" undefined!

In Debian, we got the error as shown below when trying to install with sudo dpkg -i nvidia-peer-memory-dkms_1.0-8_all.deb.

DKMS make.log for nvidia-peer-memory-1.0 for kernel 5.2.0-050200-generic (x86_64)
Thu Oct  3 07:30:38 HKT 2019
/var/lib/dkms/nvidia-peer-memory/1.0/build/create_nv.symvers.sh 5.2.0-050200-generic
-W- Could not get list of nvidia symbols.
Found /usr/src/nvidia-<omitted>//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-<omitted>//nvidia/nv-p2p.h /var/lib/dkms/nvidia-peer-memory/1.0/build/nv-p2p.h
cp -rf /usr/src/ofa_kernel/5.2.0-050200-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/5.2.0-050200-generic/build  M=/var/lib/dkms/nvidia-peer-memory/1.0/build modules
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[1]: Entering directory '/usr/src/linux-headers-5.2.0-050200-generic'
  CC [M]  /var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.o
/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
ERROR: "nvidia_p2p_dma_map_pages" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
ERROR: "nvidia_p2p_dma_unmap_pages" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
ERROR: "nvidia_p2p_free_page_table" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
ERROR: "nvidia_p2p_free_dma_mapping" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
ERROR: "nvidia_p2p_get_pages" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
ERROR: "nvidia_p2p_put_pages" [/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[2]: *** [__modpost] Error 1
Makefile:1604: recipe for target 'modules' failed
make[1]: *** [modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.2.0-050200-generic'
Makefile:56: recipe for target 'all' failed
make: *** [all] Error 2

In Linux 4.x these are shown as WARNINGs, but they have been upgraded to ERRORs in Linux 5.x.

Further investigation shows that ./create_nv.symvers.sh reports -W- Could not get list of nvidia symbols. on Ubuntu. The script fails at line 90, if ! (nm -o $nvidia_mod | grep -q "__crc_nvidia_p2p_"); then, because nvidia.ko does not contain __crc_nvidia_p2p_ symbols.

I believe this issue occurs when installing the NVIDIA driver with DKMS support. The issue is not observed on RHEL.

For the NVIDIA driver, I tried version 418.39 and newer; I believe any 418.* release can reproduce this bug. The OS I used was Ubuntu 18.04 with Linux 4.15 (warning) and Linux 5.2 (error).
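A quick way to check up front whether the installed nvidia.ko actually carries the versioned CRC symbols that line 90 greps for (a sketch; assumes an uncompressed .ko, as on Ubuntu):

# Locate the nvidia module for the running kernel
NV_KO=$(modinfo -n nvidia)
# If this prints nothing, the module has no __crc_nvidia_p2p_* symbols and the script fails as above
nm -o "$NV_KO" | grep __crc_nvidia_p2p_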

MLNX_OFED version

It's a ConnectX-4 Lx. Can MLNX_OFED 3.4-1.0.0.0 satisfy the requirement? The README.md and some other suggestions say it needs MLNX_OFED 2.1.

Thanks.

Error occurs in `sudo dpkg -i nvidia-peer-memory-dkms_1.0-1_all.deb`

@rleon

I'm trying to install this module for GPU Direct RDMA. But the error occurs when I install sudo dpkg -i nvidia-peer-memory-dkms_1.0-1_all.deb

Unpacking nvidia-peer-memory-dkms (1.0-1) over (1.0-1) ...
Setting up nvidia-peer-memory-dkms (1.0-1) ...

Creating symlink /var/lib/dkms/nvidia-peer-memory/1.0/source ->
                 /usr/src/nvidia-peer-memory-1.0

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area....
make KERNELRELEASE=3.13.0-98-generic all && make DESTDIR=/var/lib/dkms/nv_peer_mem/1.0/build install....
cleaning build area....

DKMS: build completed.

nv_peer_mem:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.13.0-98-generic/updates/dkms/

depmod....

DKMS: install completed.
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument

Do you know what's the problems?
Thanks!

nv_dma_unmap() is not protected by nv_mem_context->is_callback

In nv_dma_unmap(), if NV_DMA_MAPPING is defined, nvidia_p2p_dma_unmap_pages() is called, which frees dma_mapping, even without setting nv_mem_context->sg_allocated.

Later, in nv_mem_put_pages(), sg_free_table() is called if nv_mem_context->sg_allocated != 0, even when NV_DMA_MAPPING is defined, which is incorrect.

Errors when installing nvidia_peer_memory-1.0-8.x86_64.rpm

Hi:
I have met an error when I install nv_peer_mem on a node.
nvidia_peer_memory-1.0-8.x86_64.rpm

[centos@gpu x86_64]$ sudo rpm -ivh nvidia_peer_memory-1.0-8.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:nvidia_peer_memory-1.0-8         ################################# [100%]
depmod: ERROR: fstatat(4, nvidia-uvm.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-modeset.ko.xz): No such file or directory

This is the newest version of nv_peer_mem; my kernel version is 3.10.0-957.27.2.el7.x86_64.
After that, I tried an older version of nv_peer_mem on another node and it succeeded; its kernel version is 3.10.0-957.12.2.el7.x86_64.
Both nodes have CUDA 10.1 installed.
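The depmod errors suggest the NVIDIA driver modules are missing for the running kernel; a hedged first check before reinstalling nv_peer_mem:

# Are the NVIDIA modules present for the kernel that is currently running?
ls /lib/modules/$(uname -r)/extra/ 2>/dev/null | grep -i nvidia
# If the driver was installed through DKMS, check whether it was built for this kernel
dkms status | grep -i nvidia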

GPUDirect RDMA is not working inside the horovod-docker

Hi all, I am running TensorFlow benchmarks inside the Horovod Docker container to evaluate the models in distributed mode. I have installed the Mellanox driver and the GPUDirect RDMA API and loaded the GPUDirect kernel module on each server. I have also checked its status to make sure GPUDirect RDMA is active, and I realized it is not recognized inside the Horovod container, see below:

Outside the docker:
service nv_peer_mem status
Output
● nv_peer_mem.service - LSB: Activates/Deactivates nv_peer_mem to \ start at boot time.
Loaded: loaded (/etc/init.d/nv_peer_mem; bad; vendor preset: enabled)
Active: active (exited) since Thu 2018-06-07 16:02:45 CDT; 16h ago
Docs: man:systemd-sysv-generator(8)
Process: 303965 ExecStart=/etc/init.d/nv_peer_mem start (code=exited, status=0/SUCCESS)
Tasks: 0
Memory: 0B
CPU: 0

Jun 07 16:02:45 C4140-V100-1 systemd[1]: Starting LSB: Activates/Deactivates nv_peer_mem to \ start at boot time....
Jun 07 16:02:45 C4140-V100-1 nv_peer_mem[303965]: starting... OK

Inside the docker:
service nv_peer_mem status
Output:
nv_peer_mem: unrecognized service

Also, when I run the benchmarks inside the container, the scaling efficiency drops from ~90% to ~77%. The system issues this warning:
host-1-V100:24:203 [0] misc/ibvwrap.cu:61 WARN Failed to open libibverbs.so[.1]
host-1-V100:24:203 [0] INFO Using internal Network Socket

Can you help me find out how to fix it? Also, what are the mpirun flags to enable RDMA (InfiniBand) and to make sure the network communication goes over RDMA (InfiniBand) instead of sockets?
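Note that nv_peer_mem is a host-side kernel module, so its init script is not visible inside the container, although lsmod still shows it. For exposing RDMA to the container itself, a hedged sketch of commonly used docker run flags (image name and flags are illustrative and depend on your Docker/NVIDIA runtime setup):

# Pass the RDMA device nodes into the container and allow memory pinning
# (older setups used nvidia-docker instead of --gpus)
docker run --gpus all --net=host \
    --device=/dev/infiniband --cap-add=IPC_LOCK \
    -it horovod/horovod:latest bash
# Inside the container, check the host's module with lsmod rather than the service script
lsmod | grep nv_peer_mem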

Makefile is not robust against nv-p2p.h not being present

NVIDIA ?= $(shell (find /usr/src/nvidia-* -name "nv-p2p.h"|xargs dirname))

If nv-p2p.h is not present, this line produces an obscure error:

make[1]: Entering directory `/usr/src/kernels/2.6.32-642.el6.x86_64'
dirname: missing operand
Try `dirname --help' for more information.

Having a problem installing

Hi, I am using Ubuntu 16.04. From the installation instruction:

./build_module.sh
 cd /tmp
 tar xzf /tmp/nvidia-peer-memory_1.0.orig.tar.gz
 cd nvidia-peer-memory-1.0
 dpkg-buildpackage -us -uc
 dpkg -i <path to generated deb files>

example is given:

(e.g. dpkg -i nv-peer-memory_1.0-6_all.deb
      dpkg -i nv-peer-memory-dkms_1.0-6_all.deb)

However, I did not find any nv-peer-memory_1.0-6_all.deb file created during the build process. I am also very confused by nv-peer-memory_1.0-6_all.deb versus nvidia-peer-memory_1.0-6_all.deb. Are they the same? Would nvidia-peer-memory_1.0-5_all.deb also work?

Thanks!

Yuxin

Failed to build nv_peer_mem on ubuntu 20.04

Project cloned from master: a5cbf19
Compilation fails on modpost

root@13481535799f:/var/lib/dkms/nvidia-peer-memory/1.0/build# make
/var/lib/dkms/nvidia-peer-memory/1.0/build/create_nv.symvers.sh 5.4.0-29-generic
Getting symbol versions from /lib/modules/5.4.0-29-generic/updates/dkms/nvidia.ko ...
Created: /var/lib/dkms/nvidia-peer-memory/1.0/build/nv.symvers
Found /usr/src/nvidia-440.64/nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-440.64/nvidia/nv-p2p.h /var/lib/dkms/nvidia-peer-memory/1.0/build/nv-p2p.h
cp -rf /usr/src/ofa_kernel/5.4.0-29-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/5.4.0-29-generic/build  M=/var/lib/dkms/nvidia-peer-memory/1.0/build modules
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-29-generic'
  CC [M]  /var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.o
/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
   80 | #pragma message("Enable nvidia_p2p_dma_map_pages support")
      |         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
FATAL: parse error in symbol dump file
make[2]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
make[1]: *** [Makefile:1632: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-29-generic'
make: *** [Makefile:60: all] Error 2

The following fixed the issue; however, I'm not sure how it would affect other distros.

diff --git a/create_nv.symvers.sh b/create_nv.symvers.sh
index 453aa64..109f24d 100755
--- a/create_nv.symvers.sh
+++ b/create_nv.symvers.sh
@@ -118,7 +118,7 @@ do
                file=$(echo $line | cut -f1 -d: | sed -r -e 's@\./@@' -e '[email protected](\S)*@@' -e "s@$PWD/@@")
                crc=$(echo $line | cut -f2 -d: | cut -f1 -d" ")
                sym=$(echo $line | cut -f2 -d: | cut -f3 -d" " | sed -e 's/__crc_//g')
-               echo -e "0x$crc\t$sym\t$file" >> $MOD_SYMVERS
+               echo -e "0x$crc\t$sym\t$file\tEXPORT_SYMBOL\t" >> $MOD_SYMVERS
        done < <(nm -o $nvidia_mod | grep -E "$modules_pat")

        echo "Created: ${MOD_SYMVERS}"

invalidation callback relies on undefined behavior of NV driver

In nv_get_p2p_free_callback(), a comment shows:

/* For now don't set nv_mem_context->page_table to NULL,
 * confirmed by NVIDIA that inflight put_pages with valid pointer will fail gracefully.
 */

This is actually a bug in umem.c:peer_umem_release(), which can call peer_mem dma_unmap() and put_pages() even if an invalidation callback has freed both the P2P DMA mappings and the page_table.

That might have accidentally worked in the past, but it is not correct anymore.

Unknown symbol error in `dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb`

Environment

System: Ubuntu 16.04
CUDA version: 10.0
Mellanox ofed: 4.6-1.0.1

$ uname  -r
4.4.0-131-generic
$ ls -l /lib/modules
total 16
drwxr-xr-x 7 root root 4096 Nov 26 20:02 4.4.0-131-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:32 4.4.0-21-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:32 4.4.0-64-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:33 4.4.0-66-generic
$ ls -l /usr/src/ofa_kernel/
total 4
drwxr-xr-x 7 root root 4096 Aug  2 02:46 4.4.0-131-generic
lrwxrwxrwx 1 root root   17 Aug  2 02:46 default -> 4.4.0-131-generic

Description

Hi. I tried to install nv_peer_memory. I ran the following commands:

./build_module.sh
cd /tmp
tar xzf /tmp/nvidia-peer-memory_1.0.orig.tar.gz
cd nvidia-peer-memory-1.0
dpkg-buildpackage -us -uc
dpkg -i /tmp/nvidia-peer-memory_1.0-8_all.deb
dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb

It failed when trying to install the DKMS deb. The full log is:

$ dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb
(Reading database ... 133469 files and directories currently installed.)
Preparing to unpack .../nvidia-peer-memory-dkms_1.0-8_all.deb ...

------------------------------
Deleting module version: 1.0
completely from the DKMS tree.
------------------------------
Done.
Unpacking nvidia-peer-memory-dkms (1.0-8) over (1.0-8) ...
Setting up nvidia-peer-memory-dkms (1.0-8) ...
Loading new nvidia-peer-memory-1.0 DKMS files...
Building only for 4.4.0-131-generic
Building initial module for 4.4.0-131-generic
Secure Boot not enabled on this system.
Done.

nv_peer_mem:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.4.0-131-generic/updates/dkms/

depmod....

DKMS: install completed.
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument
dpkg: error processing package nvidia-peer-memory-dkms (--install):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 nvidia-peer-memory-dkms

The dmesg errors are:

$ dmesg | grep nv_peer_mem
[1624474.366292] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[1624474.366314] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[1624474.366316] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[1624474.366338] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[1624474.366340] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)
[1633847.270244] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[1633847.270249] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[1633847.270275] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[1633847.270277] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[1633847.270296] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[1633847.270298] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[1633847.270347] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[1633847.270349] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[1633847.270367] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[1633847.270369] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[1633847.270386] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[1633847.270388] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)

I checked the following similar issues but found that they are not the source of the problem (see also the diagnostic sketch below).

  • Kernel mismatch: in my case I used the same kernel, 4.4.0-131-generic, to both compile and install, and I believe that problem has already been fixed.
  • Wrong kernel module name: running make in the /tmp dir built nv_peer_mem.ko, so that is not the problem.
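
One more hedged diagnostic sketch (commands are generic, not taken from the original report): "disagrees about version of symbol" means the nvidia_p2p_* CRCs recorded when nv_peer_mem.ko was built do not match the nvidia.ko that is currently installed, which can happen if the NVIDIA driver was updated after the module was built. The two sets of CRCs can be compared directly:

# CRCs the freshly built module expects
modprobe --dump-modversions /lib/modules/$(uname -r)/updates/dkms/nv_peer_mem.ko | grep nvidia_p2p
# CRCs exported by the installed NVIDIA kernel module
# (path is an example; adjust to where nvidia.ko actually lives)
nm -o /lib/modules/$(uname -r)/kernel/drivers/video/nvidia.ko | grep __crc_nvidia_p2p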

Ubuntu deb missing dependency on mlnx-ofed-kernel

The problem we are seeing on Ubuntu is that after a kernel + MLNX OFED upgrade, DKMS may try to build nv_peer_mem before ofa_kernel, so the /var/lib/dkms/mlnx-ofed-kernel/3.4/build/Module.symvers file is not present yet:

$ cat /var/lib/dkms/nvidia-peer-memory/1.1/build/make.log
DKMS make.log for nvidia-peer-memory-1.1 for kernel 4.2.0-27-generic (x86_64)
Tue Nov 22 10:47:32 PST 2016
cp -rf /Module.symvers .
cp: cannot stat ‘/Module.symvers’: No such file or directory
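
One possible way to express that build ordering, as a sketch rather than the project's actual packaging, is to declare the dependency in dkms.conf so that DKMS builds mlnx-ofed-kernel before nvidia-peer-memory (the module name is an assumption and must match what `dkms status` reports):

# hypothetical addition to nvidia-peer-memory's dkms.conf
BUILD_DEPENDS[0]="mlnx-ofed-kernel"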

upgrading kernel with nv_peer_mem-1.0-5 breaks on Ubuntu 14.04.5

Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-116-generic x86_64)
...
[76463.688316] nv_peer_mem: disagrees about version of symbol ib_register_peer_memory_client
[76463.688326] nv_peer_mem: Unknown symbol ib_register_peer_memory_client (err -22)

this is from nvidia-peer-memory_1.0.5.tar.gz

The problem seems to be that nv_peer_mem/Makefile picks the ofa_kernel symbols from /usr/src/ofa_kernel/default instead of /usr/src/ofa_kernel/$(uname -r), and for some reason the default symlink has not been updated to point at the running kernel:

$ uname -r
4.4.0-116-generic
$ ls -l //usr/src/ofa_kernel/
total 8
drwxr-xr-x 7 root root 4096 Apr 1 17:35 4.4.0-116-generic
drwxr-xr-x 7 root root 4096 Dec 8 18:04 4.4.0-97-generic
lrwxrwxrwx 1 root root 16 Dec 8 18:04 default -> 4.4.0-97-generic
$ ls -l //usr/src/ofa_kernel/default
lrwxrwxrwx 1 root root 16 Dec 8 18:04 //usr/src/ofa_kernel/default -> 4.4.0-97-generic
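
Until the Makefile/packaging handles this, one hedged workaround, assuming MLNX_OFED has already installed the ofa_kernel tree for the running kernel, is to repoint the default symlink and rebuild:

# make /usr/src/ofa_kernel/default track the running kernel again
ln -sfn "$(uname -r)" /usr/src/ofa_kernel/default
# then rebuild and reinstall nv_peer_mem so it picks up matching symbols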

Ubuntu 18.04 failure

building the latest repo version on Ubuntu 18.04 fails in a subtle way:

<...>/nv_peer_memory/create_nv.symvers.sh 4.15.0-20-generic
-W- Could not get list of nvidia symbols.
Found /usr/src/nvidia-410.09//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-410.09//nvidia/nv-p2p.h /home/lab/IB/nv_peer_memory/nv-p2p.h
cp -rf /usr/src/ofa_kernel/4.15.0-20-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/4.15.0-20-generic/build  M=/home/lab/IB/nv_peer_memory modules
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-20-generic'
  CC [M]  /home/lab/IB/nv_peer_memory/nv_peer_mem.o
/home/lab/IB/nv_peer_memory/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: "nvidia_p2p_dma_map_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_dma_unmap_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_free_page_table" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_free_dma_mapping" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_get_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_put_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
  LD [M]  /home/lab/IB/nv_peer_memory/nv_peer_mem.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-20-generic'

it seems to be related to the kernel not being built with modversions enabled, e.g. in /boot/config-4.15.0-20-generic:

# CONFIG_MODVERSIONS is not set
CONFIG_MODULE_SRCVERSION_ALL=y

It is not clear whether this is fatal or not, though we are observing run-time errors when trying to send GPU memory.
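
A quick hedged check for this condition is to look at the running kernel's config; without CONFIG_MODVERSIONS there are no __crc_* symbols in nvidia.ko for create_nv.symvers.sh to harvest, which would explain both the "Could not get list of nvidia symbols" warning and the undefined-symbol warnings from MODPOST:

grep CONFIG_MODVERSIONS /boot/config-$(uname -r)
# CONFIG_MODVERSIONS=y              -> symbol CRC checking is available
# "# CONFIG_MODVERSIONS is not set" -> the behaviour above is expected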

When installing the official 1.0-3 release via debian package, the source in the official tar file is not used.

When using build_release.sh to build a debian package, it does not use the source in the nvidia-peer-memory-1.0-3.tar.gz file, but rather uses git to check out a newer version of the source from this repository. This causes a few problems:
1) The apparent source does not necessarily match the package.
2) Since changes have been made after the changelog file was checked in, the packages
nvidia-peer-memory-dkms_1.0-3_all.deb
nvidia-peer-memory_1.0-3_all.deb
could have been built from any one of the following commits:
d93f07d Merge pull request #16 from Mellanox/nv_dma_mapping_fix
0a39a45 Update nv_peer_mem.c
d695cb6 Add missing free dma mapping call
fa142d7 Clear sg_allocated value aftersg page table free
5029c0d Protect nv_dma_unmap() from p2p_free_callback
f4e172f Temporary commit to disable nvidia_p2p_dma_map_pages
a24c818 Roll 1.0-3 release

nv_peer_mem disappearing after a reboot

I'm running Ubuntu 18.04 (bionic) (kernel: 4.15.0-46-generic) and have installed both MLNX_OFED and nv_peer_mem, but after a reboot nv_peer_mem no longer shows up in lsmod, although modinfo still finds it. I have to keep reinstalling nv_peer_mem on the system.
Does anyone know how to fix this issue?

lsmod | grep nv_peer

modinfo nv_peer_mem

filename: /lib/modules/4.15.0-46-generic/updates/dkms/nv_peer_mem.ko
version: 1.0-8
license: Dual BSD/GPL
description: NVIDIA GPU memory plug-in
author: Yishai Hadas
srcversion: 9F372F055FA43FA7440C57D
depends: ib_core
retpoline: Y
name: nv_peer_mem
vermagic: 4.15.0-46-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
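
A hedged checklist for this situation (paths follow the defaults this package installs; adjust to your system):

# 1) is boot-time loading enabled in the packaged config?
cat /etc/infiniband/nv_peer_mem.conf        # should say the module is loaded on boot
# 2) is the init/service script enabled? (systemd wraps /etc/init.d/nv_peer_mem)
sudo systemctl enable nv_peer_mem
# 3) load it for the current boot and confirm
sudo modprobe nv_peer_mem && lsmod | grep nv_peer_mem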

unable to install nvidia-peer-memory-dkms_1.1-0_all.deb

I am getting the following error in the last step of the instructions, and I haven't figured out what is wrong. See below:

dpkg -i ../nvidia-peer-memory-dkms_1.1-0_all.deb

Selecting previously unselected package nvidia-peer-memory-dkms.
(Reading database ... 249155 files and directories currently installed.)
Preparing to unpack .../nvidia-peer-memory-dkms_1.1-0_all.deb ...
Unpacking nvidia-peer-memory-dkms (1.1-0) ...
Setting up nvidia-peer-memory-dkms (1.1-0) ...
Loading new nvidia-peer-memory-1.1 DKMS files...
Building for 5.4.0-56-generic
Building initial module for 5.4.0-56-generic
Secure Boot not enabled on this system.
Done.

nv_peer_mem:
Running module version sanity check.

  • Original module
    • No original module exists within this kernel
  • Installation
    • Installing to /lib/modules/5.4.0-56-generic/updates/dkms/

depmod...

DKMS: install completed.
modprobe: ERROR: could not insert 'nv_peer_mem': Unknown symbol in module, or unknown parameter (see dmesg)
dpkg: error processing package nvidia-peer-memory-dkms (--install):
installed nvidia-peer-memory-dkms package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
nvidia-peer-memory-dkms
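
A hedged diagnostic sketch (generic commands, not from this report): dmesg names the symbol the kernel rejected; if it is one of the nvidia_p2p_* symbols, the mismatch is against the NVIDIA driver (see the earlier issue), and if it is ib_register_peer_memory_client, the ib_core that is actually loaded may be the inbox one rather than the MLNX_OFED one that exports the peer-memory API:

dmesg | grep nv_peer_mem | tail                        # which symbol is unknown/mismatched?
lsmod | grep -E '^(ib_core|nvidia) '                   # are ib_core and nvidia loaded at all?
grep ib_register_peer_memory_client /proc/kallsyms     # no output => loaded ib_core lacks the peer-memory API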

Why doesn't it show a connection via NET/IB/0/GDRDMA?

Environment:

  1. Framework: TensorFlow
  2. Framework version: TF 1.4
  3. Horovod version: 0.18.2 via Horovod in docker
  4. MPI version: 4.0.0
  5. CUDA version: 10.0
  6. NCCL version: 2.4.7-1
  7. Python version: 2.7
  8. OS and version: Ubuntu 18.06
  9. GCC version: 4.8
  10. Mellanox OFED 4.7.1
  11. GPUDirect RDMA - nvidia-peer-memory_1.0-8

Your question:
I am running the TF benchmarks in multi-node mode with the latest version of Horovod via Docker, but I am not seeing the connection reported via NET/IB/0/GDRDMA; see the trace log below.

Tracelog
master_node:20:289 [0] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.1<0>
master_node:20:289 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
master_node:20:289 [0] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
master_node:20:289 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.1<0>
NCCL version 2.4.7+cuda10.0
master_node:22:295 [2] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.1<0>
master_node:22:295 [2] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
master_node:21:290 [1] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.1<0>
master_node:23:288 [3] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.1<0>
master_node:21:290 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
master_node:23:288 [3] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
master_node:22:295 [2] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
master_node:21:290 [1] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
master_node:23:288 [3] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
secondary_node:44:311 [3] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.2<0>
secondary_node:41:312 [0] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.2<0>
secondary_node:42:310 [1] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.2<0>
secondary_node:43:309 [2] NCCL INFO NET/Socket : Using [0]ib0:192.168.11.2<0>
secondary_node:42:310 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
secondary_node:43:309 [2] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
secondary_node:44:311 [3] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
secondary_node:41:312 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
secondary_node:43:309 [2] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
secondary_node:44:311 [3] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
secondary_node:42:310 [1] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
secondary_node:41:312 [0] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
master_node:22:295 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.1<0>
master_node:23:288 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.1<0>
master_node:21:290 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.1<0>
secondary_node:43:309 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.2<0>
secondary_node:44:311 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.2<0>
secondary_node:41:312 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.2<0>
secondary_node:42:310 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB ; OOB ib0:192.168.11.2<0>
master_node:20:289 [0] NCCL INFO Setting affinity for GPU 0 to 5555,55555555,55555555
master_node:23:288 [3] NCCL INFO Setting affinity for GPU 3 to aaaa,aaaaaaaa,aaaaaaaa
master_node:21:290 [1] NCCL INFO Setting affinity for GPU 1 to 5555,55555555,55555555
master_node:22:295 [2] NCCL INFO Setting affinity for GPU 2 to aaaa,aaaaaaaa,aaaaaaaa
secondary_node:44:311 [3] NCCL INFO Setting affinity for GPU 3 to aaaa,aaaaaaaa,aaaaaaaa
secondary_node:43:309 [2] NCCL INFO Setting affinity for GPU 2 to aaaa,aaaaaaaa,aaaaaaaa
secondary_node:41:312 [0] NCCL INFO Setting affinity for GPU 0 to 5555,55555555,55555555
secondary_node:42:310 [1] NCCL INFO Setting affinity for GPU 1 to 5555,55555555,55555555
secondary_node:41:312 [0] NCCL INFO CUDA Dev 0[0], IB NIC distance : SYS
secondary_node:44:311 [3] NCCL INFO CUDA Dev 3[3], IB NIC distance : NODE
secondary_node:42:310 [1] NCCL INFO CUDA Dev 1[1], IB NIC distance : SYS
secondary_node:43:309 [2] NCCL INFO CUDA Dev 2[2], IB NIC distance : NODE
master_node:22:295 [2] NCCL INFO CUDA Dev 2[2], IB NIC distance : NODE
master_node:23:288 [3] NCCL INFO CUDA Dev 3[3], IB NIC distance : NODE
master_node:21:290 [1] NCCL INFO CUDA Dev 1[1], IB NIC distance : SYS
master_node:20:289 [0] NCCL INFO CUDA Dev 0[0], IB NIC distance : SYS
master_node:20:289 [0] NCCL INFO Channel 00 : 0 1 3 6 4 5 7 2
master_node:20:289 [0] NCCL INFO Channel 01 : 0 1 3 6 4 5 7 2
master_node:22:295 [2] NCCL INFO Ring 00 : 7 -> 2 [receive] via NET/IB/0
master_node:22:295 [2] NCCL INFO Ring 00 : 2[2] -> 0[0] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 00 : 3 -> 6 [receive] via NET/IB/0
master_node:21:290 [1] NCCL INFO Ring 00 : 1[1] -> 3[3] via P2P/IPC
master_node:20:289 [0] NCCL INFO Ring 00 : 0[0] -> 1[1] via P2P/IPC
master_node:23:288 [3] NCCL INFO Ring 00 : 3 -> 6 [send] via NET/IB/0
master_node:23:288 [3] NCCL INFO Ring 00 : 3[3] -> 1[1] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 00 : 6[2] -> 4[0] via P2P/IPC
master_node:21:290 [1] NCCL INFO Ring 00 : 1[1] -> 0[0] via P2P/IPC
master_node:20:289 [0] NCCL INFO Ring 00 : 0[0] -> 2[2] via P2P/IPC
master_node:21:290 [1] NCCL INFO Ring 01 : 1[1] -> 3[3] via P2P/IPC
master_node:23:288 [3] NCCL INFO Ring 01 : 3 -> 6 [send] via NET/IB/0
secondary_node:42:310 [1] NCCL INFO Ring 00 : 5[1] -> 7[3] via P2P/IPC
secondary_node:41:312 [0] NCCL INFO Ring 00 : 4[0] -> 5[1] via P2P/IPC
secondary_node:44:311 [3] NCCL INFO Ring 00 : 7 -> 2 [send] via NET/IB/0
master_node:22:295 [2] NCCL INFO Ring 00 : 6 -> 2 [receive] via NET/IB/0
master_node:20:289 [0] NCCL INFO Ring 01 : 0[0] -> 1[1] via P2P/IPC
master_node:21:290 [1] NCCL INFO Ring 01 : 1[1] -> 0[0] via P2P/IPC
secondary_node:44:311 [3] NCCL INFO Ring 00 : 7[3] -> 5[1] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 00 : 6 -> 2 [send] via NET/IB/0
secondary_node:42:310 [1] NCCL INFO Ring 00 : 5[1] -> 4[0] via P2P/IPC
secondary_node:41:312 [0] NCCL INFO Ring 00 : 4[0] -> 6[2] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 00 : 2 -> 6 [receive] via NET/IB/0
master_node:22:295 [2] NCCL INFO Ring 00 : 2 -> 6 [send] via NET/IB/0
master_node:22:295 [2] NCCL INFO Ring 01 : 7 -> 2 [receive] via NET/IB/0
master_node:22:295 [2] NCCL INFO Ring 01 : 2[2] -> 0[0] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 01 : 3 -> 6 [receive] via NET/IB/0
master_node:23:288 [3] NCCL INFO Ring 01 : 3[3] -> 1[1] via P2P/IPC
master_node:21:290 [1] NCCL INFO Trees [0] 0->1->3/-1/-1 [1] 0->1->3/-1/-1
secondary_node:44:311 [3] NCCL INFO Ring 01 : 7 -> 2 [send] via NET/IB/0
master_node:23:288 [3] NCCL INFO Trees [0] 1->3->-1/-1/-1 [1] 1->3->-1/-1/-1
master_node:20:289 [0] NCCL INFO Ring 01 : 0[0] -> 2[2] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 01 : 6[2] -> 4[0] via P2P/IPC
master_node:21:290 [1] NCCL INFO comm 0x7f4d6839f060 rank 1 nranks 8 cudaDev 1 nvmlDev 1 - Init COMPLETE
master_node:23:288 [3] NCCL INFO comm 0x7f48503a3650 rank 3 nranks 8 cudaDev 3 nvmlDev 3 - Init COMPLETE
master_node:20:289 [0] NCCL INFO Trees [0] 2->0->1/-1/-1 [1] 2->0->1/-1/-1
master_node:20:289 [0] NCCL INFO Using 256 threads, Min Comp Cap 7, Trees enabled for all sizes
secondary_node:42:310 [1] NCCL INFO Ring 01 : 5[1] -> 7[3] via P2P/IPC
secondary_node:41:312 [0] NCCL INFO Ring 01 : 4[0] -> 5[1] via P2P/IPC
master_node:20:289 [0] NCCL INFO comm 0x7f5450362840 rank 0 nranks 8 cudaDev 0 nvmlDev 0 - Init COMPLETE
master_node:22:295 [2] NCCL INFO Ring 01 : 2 -> 6 [send] via NET/IB/0
secondary_node:44:311 [3] NCCL INFO Ring 01 : 7[3] -> 5[1] via P2P/IPC
secondary_node:43:309 [2] NCCL INFO Ring 01 : 2 -> 6 [receive] via NET/IB/0
secondary_node:44:311 [3] NCCL INFO Trees [0] 5->7->-1/-1/-1 [1] 5->7->-1/-1/-1
master_node:22:295 [2] NCCL INFO Ring 01 : 6 -> 2 [receive] via NET/IB/0
secondary_node:42:310 [1] NCCL INFO Ring 01 : 5[1] -> 4[0] via P2P/IPC
secondary_node:41:312 [0] NCCL INFO Ring 01 : 4[0] -> 6[2] via P2P/IPC
secondary_node:44:311 [3] NCCL INFO comm 0x7ff2c43f7c00 rank 7 nranks 8 cudaDev 3 nvmlDev 3 - Init COMPLETE
secondary_node:42:310 [1] NCCL INFO Trees [0] 4->5->7/-1/-1 [1] 4->5->7/-1/-1
secondary_node:41:312 [0] NCCL INFO Trees [0] 6->4->5/-1/-1 [1] 6->4->5/-1/-1
secondary_node:41:312 [0] NCCL INFO comm 0x7fd8dc3c6740 rank 4 nranks 8 cudaDev 0 nvmlDev 0 - Init COMPLETE
secondary_node:43:309 [2] NCCL INFO Ring 01 : 6 -> 2 [send] via NET/IB/0
secondary_node:43:309 [2] NCCL INFO Trees [0] 2->6->4/-1/-1 [1] -1->6->4/2/-1
secondary_node:42:310 [1] NCCL INFO comm 0x7fa7cc422c90 rank 5 nranks 8 cudaDev 1 nvmlDev 1 - Init COMPLETE
secondary_node:43:309 [2] NCCL INFO comm 0x7fce9c438c90 rank 6 nranks 8 cudaDev 2 nvmlDev 2 - Init COMPLETE
master_node:22:295 [2] NCCL INFO Trees [0] -1->2->0/6/-1 [1] 6->2->0/-1/-1
master_node:22:295 [2] NCCL INFO comm 0x7fd8f038f460 rank 2 nranks 8 cudaDev 2 nvmlDev 2 - Init COMPLETE
master_node:20:289 [0] NCCL INFO Launch mode Parallel
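
A couple of hedged things to check (the environment variables below are standard NCCL knobs, not something taken from this log): NCCL only reports a connection via NET/IB/0/GDRDMA when nv_peer_mem is loaded on the hosts and when the GPU and the IB NIC are close enough on the PCIe topology; the "IB NIC distance : SYS" lines above mean some GPUs only reach the NIC through the CPU interconnect, which NCCL will not use for GPUDirect RDMA by default.

# on each host, outside the container: is the peer-memory module loaded?
lsmod | grep nv_peer_mem
# for experimentation only: relax the allowed GPU<->NIC distance for GDR
# (older NCCL releases may expect a numeric level instead of a keyword)
export NCCL_NET_GDR_LEVEL=SYS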

Failed to reg big GPU mem

Hi,
we are running a PoC that tries to pin GPU memory, but it fails with the error message below.
Registering small GPU buffers (a few MB) works, but registration fails once the size exceeds 128 MB;
meanwhile, there is no problem registering CPU memory at very large sizes.

Could you please help us check what the cause is? Thanks.

[2017/07/20-04:47:36.231116] xio_rdma_verbs.c:248 [ERROR] - ibv_reg_mr failed, Bad address. addr:0x7f21e3369dc0, length:15863892992, access:0x7
dmesg shows error in:
[79199.665302] ib_umem_get: failed to get user pages, nr_pages=512
[79199.669653] mlx5_0:mr_umem_get:709:(pid 15855): umem get failed (-131668346275144)

Relevant system info:
Mellanox Technologies MT28800 Family ConnectX-5, firmware version: 16.20.1010
256GB system memory (and lots of free mem at that moment)
ubuntu 16.04, 4.4.0-83-generic
MLNX_OFED_LINUX-4.1-1.0.2.0 (OFED-4.1-1.0.2)
CUDA 8.0, 375.66
latest nv_peer_memory (checked out in July 2017)
Tried NVIDIA P100 (PCIe) and K80 GPUs.

We have already followed the practices described here:
https://community.mellanox.com/docs/DOC-1120
http://www.rdmamojo.com/2012/09/07/ibv_reg_mr/

root@B4130:/tmp# ulimit -l
unlimited
root@B4130:/tmp# cat /sys/module/mlx4_core/parameters/log_num_mtt
24
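
One hedged thing to check that is not covered above: GPUDirect RDMA maps the registered GPU memory through the GPU's BAR1 aperture, so the amount that can be pinned at once is bounded by BAR1 size rather than by system memory (also note the card is a ConnectX-5 driven by mlx5_core, so the mlx4_core log_num_mtt parameter checked above does not apply to it). nvidia-smi reports the BAR1 size:

nvidia-smi -q -d MEMORY | grep -A 3 "BAR1 Memory Usage"
# if "Total" here is smaller than the ~15 GB registration in the log above,
# ibv_reg_mr on GPU memory is expected to fail even though host memory works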

insmod: ERROR: could not insert module nv_peer_mem.ko: Invalid parameters in CentOS 7.5

[dgx@dhcp-10-19-192-252 Jerry]$ cd nv_peer_memory_1.08/
[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ make
[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ sudo insmod nv_peer_mem.ko
[sudo] password for dgx:
insmod: ERROR: could not insert module nv_peer_mem.ko: Invalid parameters

[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ uname -r && cat /etc/*release
3.10.0-862.el7.x86_64
CentOS Linux release 7.5.1804 (Core)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

CentOS Linux release 7.5.1804 (Core)
CentOS Linux release 7.5.1804 (Core)
[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ gcc --version
gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ lspci |grep mellanox -i
3e:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
3e:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ ofed_info|head -1
MLNX_OFED_LINUX-4.7-1.0.0.1 (OFED-4.7-1.0.0):

[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

[dgx@dhcp-10-19-192-252 nv_peer_memory_1.08]$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 418.87.01 Wed Sep 25 06:00:38 UTC 2019
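
A hedged note on the reproduction above: insmod does not resolve module dependencies, so even when the symbols line up it is usually better to install the module where depmod can see it and load it with modprobe (which pulls in ib_core), then read dmesg for the kernel's exact complaint:

sudo make install      # per the Makefile, copies nv_peer_mem.ko under /lib/modules/$(uname -r)/extra/
sudo depmod -a
sudo modprobe nv_peer_mem
dmesg | tail           # shows which symbol or parameter was rejected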

install failed after kernel upgrade

After I upgraded the kernel from h142 to h193:

[nvidia-peer-memory-1.0]# ./build_module.sh 

Building source rpm for nvidia_peer_memory...

Built: /tmp/nvidia_peer_memory-1.0-7.src.rpm

To install run on RPM based OS:
    # rpmbuild --rebuild /tmp/nvidia_peer_memory-1.0-7.src.rpm
    # rpm -ivh <path to generated binary rpm file>



[nvidia-peer-memory-1.0]# rpmbuild --rebuild /tmp/nvidia_peer_memory-1.0-7.src.rpm
Installing /tmp/nvidia_peer_memory-1.0-7.src.rpm
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.fdt3MB
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd /root/rpmbuild/BUILD
+ rm -rf nvidia_peer_memory-1.0
+ /usr/bin/gzip -dc /root/rpmbuild/SOURCES/nvidia_peer_memory-1.0.tar.gz
+ /usr/bin/tar -xvvf -
drwx------ root/root         0 2019-07-22 15:49 nvidia_peer_memory-1.0/
-rw------- root/root      5817 2019-07-22 15:49 nvidia_peer_memory-1.0/compat_nv-p2p.h
drwx------ root/root         0 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/
drwx------ root/root         0 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/source/
-rw------- root/root        12 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/source/format
-rwx------ root/root       199 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/nvidia-peer-memory.prerm
-rwx------ root/root       231 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/nvidia-peer-memory-dkms.prerm
-rw------- root/root         2 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/compat
-rw------- root/root      1613 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/changelog
-rw------- root/root       912 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/control
-rwx------ root/root      1362 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/rules
-rwx------ root/root       506 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/nvidia-peer-memory-dkms.postinst
-rwx------ root/root       431 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/updateInit.sh
-rwx------ root/root       198 2019-07-22 15:49 nvidia_peer_memory-1.0/debian/nvidia-peer-memory.postinst
-rw------- root/root       614 2019-07-22 15:49 nvidia_peer_memory-1.0/dkms.conf
-rw------- root/root        47 2019-07-22 15:49 nvidia_peer_memory-1.0/nv_peer_mem.conf
-rwx------ root/root      2276 2019-07-22 15:49 nvidia_peer_memory-1.0/build_module.sh
-rwx------ root/root     13013 2019-07-22 15:49 nvidia_peer_memory-1.0/nv_peer_mem.c
-rw------- root/root      3415 2019-07-22 15:49 nvidia_peer_memory-1.0/README.md
-rwx------ root/root      3765 2019-07-22 15:49 nvidia_peer_memory-1.0/create_nv.symvers.sh
-rwx------ root/root       241 2019-07-22 15:49 nvidia_peer_memory-1.0/nv_peer_mem.upstart
-rw------- root/root      3299 2019-07-22 15:49 nvidia_peer_memory-1.0/nvidia_peer_memory.spec
-rw------- root/root      3707 2019-07-22 15:49 nvidia_peer_memory-1.0/Makefile
-rwx------ root/root      2756 2019-07-22 15:49 nvidia_peer_memory-1.0/nv_peer_mem
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd nvidia_peer_memory-1.0
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.yPvOeO
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd nvidia_peer_memory-1.0
+ export KVER=3.10.0-514.44.5.10.h193.x86_64
+ KVER=3.10.0-514.44.5.10.h193.x86_64
+ make KVER=3.10.0-514.44.5.10.h193.x86_64 all
/root/rpmbuild/BUILD/nvidia_peer_memory-1.0/create_nv.symvers.sh 3.10.0-514.44.5.10.h193.x86_64
Getting symbol versions from /lib/modules/3.10.0-514.44.5.10.h193.x86_64/kernel/drivers/video/nvidia.ko ...
Created: /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv.symvers
Found /usr/src/nvidia-418.39//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-418.39//nvidia/nv-p2p.h /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv-p2p.h
cp -rf /usr/src/ofa_kernel/default/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/3.10.0-514.44.5.10.h193.x86_64/build  M=/root/rpmbuild/BUILD/nvidia_peer_memory-1.0 modules
make[1]: Entering directory `/usr/src/kernels/3.10.0-514.44.5.10.h193.x86_64'
  CC [M]  /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.o
/root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.mod.o
  LD [M]  /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.ko
make[1]: Leaving directory `/usr/src/kernels/3.10.0-514.44.5.10.h193.x86_64'
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.hdfOm7
+ umask 022
+ cd /root/rpmbuild/BUILD
+ '[' /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64 '!=' / ']'
+ rm -rf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64
++ dirname /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64
+ mkdir -p /root/rpmbuild/BUILDROOT
+ mkdir /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64
+ cd nvidia_peer_memory-1.0
+ export KVER=3.10.0-514.44.5.10.h193.x86_64
+ KVER=3.10.0-514.44.5.10.h193.x86_64
+ make DESTDIR=/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64 KVER=3.10.0-514.44.5.10.h193.x86_64 install
mkdir -p /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64//lib/modules/3.10.0-514.44.5.10.h193.x86_64/extra/;
cp -f /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.ko /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64//lib/modules/3.10.0-514.44.5.10.h193.x86_64/extra/;
if [ ! -n "/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64" ]; then /sbin/depmod -r -ae 3.10.0-514.44.5.10.h193.x86_64;fi;
+ install -d /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64/etc/infiniband
+ install -m 0644 /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.conf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64/etc/infiniband
+ install -d /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64/etc/init.d
+ install -m 0755 /root/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64/etc/init.d
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-compress
+ /usr/lib/rpm/redhat/brp-strip /usr/bin/strip
+ /usr/lib/rpm/redhat/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump
+ /usr/lib/rpm/redhat/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/brp-python-bytecompile /usr/bin/python 1
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/lib/rpm/redhat/brp-java-repack-jars
Processing files: nvidia_peer_memory-1.0-7.x86_64
Provides: nvidia_peer_memory = 1.0-7 nvidia_peer_memory(x86-64) = 1.0-7
Requires(interp): /bin/sh /bin/sh
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires(post): /bin/sh
Requires(preun): /bin/sh
Requires: /bin/bash
Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64
Wrote: /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-7.x86_64.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.9VAnUK
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd nvidia_peer_memory-1.0
+ cd /tmp
+ chmod -R o+w /root/rpmbuild/BUILD/nvidia_peer_memory-1.0
+ rm -rf /root/rpmbuild/BUILD/nvidia_peer_memory-1.0
+ test x/root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64 '!=' x
+ rm -rf /root/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-7.x86_64
+ exit 0
Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.SmzrJ4
+ umask 022
+ cd /root/rpmbuild/BUILD
+ rm -rf nvidia_peer_memory-1.0
+ exit 0



[nvidia-peer-memory-1.0]# rpm -ivh /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-7.x86_64.rpm
Preparing...                          ################################# [100%]
	package nvidia_peer_memory-1.0-7.x86_64 is already installed

but checking the service status reports ERROR: Module nv_peer_mem not found:

● nv_peer_mem.service - LSB: Activates/Deactivates nv_peer_mem module to start at boot time.
   Loaded: loaded (/etc/rc.d/init.d/nv_peer_mem; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2019-07-22 15:42:28 CST; 10min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 29524 ExecStart=/etc/rc.d/init.d/nv_peer_mem start (code=exited, status=1/FAILURE)

Jul 22 15:42:28  systemd[1]: Starting LSB: Activates/Deactivates nv_peer_mem module to start at boot time....
Jul 22 15:42:28  nv_peer_mem[29524]: starting... modinfo: ERROR: Module nv_peer_mem not found.
Jul 22 15:42:28  nv_peer_mem[29524]: Module nv_peer_mem does not exist
Jul 22 15:42:28  nv_peer_mem[29524]: Failed to load nv_peer_mem
Jul 22 15:42:28  systemd[1]: nv_peer_mem.service: control process exited, code=exited status=1
Jul 22 15:42:28  systemd[1]: Failed to start LSB: Activates/Deactivates nv_peer_mem module to start at boot time..
Jul 22 15:42:28  systemd[1]: Unit nv_peer_mem.service entered failed state.
Jul 22 15:42:28  systemd[1]: nv_peer_mem.service failed.

and locate nv_peer_mem.ko still shows the old kernel version path:

[nvidia-peer-memory-1.0]# locate nv_peer_mem.ko
/usr/lib/modules/3.10.0-514.44.5.10.h142.x86_64/extra/nv_peer_mem.ko
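
Since rpm considered the package already installed, the module rebuilt for h193 never landed in the new kernel's module tree. A hedged workaround sketch, reusing the package path from the log above:

# force a reinstall of the same version, now built against the h193 kernel
sudo rpm -Uvh --replacepkgs /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.0-7.x86_64.rpm
sudo depmod -a
sudo service nv_peer_mem restart
ls /lib/modules/$(uname -r)/extra/nv_peer_mem.ko   # should now exist for the running kernel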

concurrent invalidation and tear-down can trigger a bug

The condition below is benign (see #15), so the peer_err() below is incorrect and confusing. It should be removed.

nvidia_p2p_dma_unmap_pages
{
...
#if NV_DMA_MAPPING
        if (!nv_mem_context->dma_mapping) {
                peer_err("nv_get_p2p_free_callback -- invalid dma_mapping\n");

nv_peer_mem dkms fails on Ubuntu 18.04

I'm not sure whether this is an issue with nv_peer_mem or with something else, but trying to build the nv_peer_mem module against the 4.15.0-10 kernel in Ubuntu bionic fails because ACCESS_ONCE is undefined. ACCESS_ONCE was removed from the kernel's compiler.h in December 2017.
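
READ_ONCE()/WRITE_ONCE() are the upstream replacements for ACCESS_ONCE() in kernels that dropped it. A hedged sketch for locating the offending use, which may be in this module's source or in an included MLNX_OFED compat header (in the latter case updating MLNX_OFED is the cleaner fix); the sed rewrite is illustrative only:

# locate the offending use (run from the nv_peer_memory source directory)
grep -rn 'ACCESS_ONCE' .
# mechanical rewrite of ACCESS_ONCE(x) to READ_ONCE(x); review each change,
# since write-side uses would need WRITE_ONCE() instead
grep -rl 'ACCESS_ONCE' . | xargs sed -i 's/ACCESS_ONCE(\([^)]*\))/READ_ONCE(\1)/g'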
