MPU

A shim driver that lets nvidia-smi inside a Docker container show the correct process list without modifying anything else.

The problems

The NVIDIA driver is not aware of PID namespaces, and nvidia-smi cannot map global PIDs to virtual PIDs, so inside a container it shows no processes. What's more, the NVIDIA driver is proprietary: even though a small part of the Linux driver is open source, we have no visibility into what happens inside it.
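
To make the global-versus-virtual PID distinction concrete, here is a minimal user-space sketch. It is not how MPU works (MPU does the translation inside the kernel); it only reads the NSpid line that Linux 4.1 and later expose in /proc/<pid>/status. The function names and the sample status text are hypothetical.

```python
def parse_nspid(status_text):
    """Return the PIDs listed on the NSpid line of a /proc/<pid>/status dump.

    The leftmost value is the PID as seen from the namespace that mounted
    procfs; the rightmost is the PID in the innermost namespace.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line, e.g. on kernels older than 4.1


def global_to_virtual(status_text):
    """Return (outermost PID, innermost PID), or None if NSpid is missing."""
    pids = parse_nspid(status_text)
    if not pids:
        return None
    return pids[0], pids[-1]


# Hypothetical status excerpt for a process running inside a container.
SAMPLE_STATUS = "Name:\tpython3\nPid:\t48211\nNSpid:\t48211\t17\n"

if __name__ == "__main__":
    print(global_to_virtual(SAMPLE_STATUS))
```

On the host, the sample process is PID 48211; inside its container it is PID 17. The driver reports the former, while a containerized nvidia-smi only knows the latter.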

The alternatives

  • add 'hostPID: true' to the pod specification
  • add '--pid=host' when starting a docker instance

Installation

NOTE: kernel 5.7.7 and later no longer export the kallsyms kernel functions (such as kallsyms_lookup_name), which means this module may not build or work properly on those kernels.
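
A quick pre-flight check for the note above could look like the following sketch. The cutoff version is simply the one stated in the note, and the helper names are hypothetical; it only compares version numbers and does not inspect the kernel's symbol table.

```python
import platform
import re

# Cutoff taken from the note above; treat it as the README's claim.
KALLSYMS_CUTOFF = (5, 7, 7)


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a string such as '4.15.0-136'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def may_lack_kallsyms_export(release):
    """True if the release is at or beyond the cutoff in the note."""
    return parse_kernel_release(release) >= KALLSYMS_CUTOFF


if __name__ == "__main__":
    release = platform.release()
    if may_lack_kallsyms_export(release):
        print(f"{release}: kallsyms functions are probably not exported")
    else:
        print(f"{release}: older than the cutoff, the module should build")
```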

  • on Debian, install the kernel headers with sudo apt install linux-headers-$(uname -r), then run sudo apt-get install build-essential to get the make toolchain
  • clone this repo
  • cd into the repo and run make
  • after the build succeeds, run sudo make install to install the module
  • use docker to create a --gpus enabled instance, run several test cases, and check the process list with nvidia-smi to confirm that all associated processes are shown correctly

The steps

  • figure out the basic mechanism of the NVIDIA driver from its open-source part
  • reverse engineer the driver with GDB and several test scripts (CUDA/NVML)
  • use our module to intercept syscalls and rewrite data structure fields, based on what the reverse engineering revealed
  • run nvidia-smi with our module loaded against several test cases

The details

  • nvidia-smi issues ioctl command 0x20 with flag 0xee4 to get the global PID list (under init_pid_ns)
  • if the PID list is non-empty, it issues ioctl command 0x20 with flag 0x1f48, passing the previously returned PIDs as input, to get each process's GPU memory consumption
  • we hook the syscall system-wide but intercept only ioctl calls on the NVIDIA control device (major number 195, minor number 255, as defined in the NVIDIA header file)
  • check whether the requesting task is in a PID namespace other than the global one (init_pid_ns); if it is in the global one, do nothing
  • otherwise, convert the PID list from global to virtual
  • the second request, which takes PIDs as input, is a little more complicated and needs two interceptors, pre and post:
    • in the pre-stage, before invoking the NVIDIA ioctl, the virtual PIDs (returned by the first request and already converted) must be converted back to global ones, since the NVIDIA driver only recognizes global PIDs
    • in the post-stage, after the NVIDIA ioctl returns, convert the global PIDs back to virtual ones
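
The flow in the list above can be simulated in user space. The sketch below is only an illustration of the two-stage translation, not the module's code: the PidNamespace class, the fake driver, and all PIDs and memory figures are hypothetical. Only the device numbers 195 and 255 come from the list.

```python
import os

NVIDIA_MAJOR = 195      # device major number, from the list above
NVIDIA_CTL_MINOR = 255  # minor number of the control device


def is_nvidia_control_device(rdev):
    """True if a device number (st_rdev) is the NVIDIA control device."""
    return os.major(rdev) == NVIDIA_MAJOR and os.minor(rdev) == NVIDIA_CTL_MINOR


class PidNamespace:
    """Toy stand-in for a PID namespace: a global <-> virtual PID table."""

    def __init__(self, global_to_virtual):
        self._g2v = dict(global_to_virtual)
        self._v2g = {v: g for g, v in self._g2v.items()}

    def to_virtual(self, global_pids):
        # PIDs that are not visible in this namespace are dropped.
        return [self._g2v[p] for p in global_pids if p in self._g2v]

    def to_global(self, virtual_pids):
        return [self._v2g[p] for p in virtual_pids if p in self._v2g]


def list_pids_hooked(driver_list_pids, ns):
    """First request: post-stage only, global -> virtual."""
    return ns.to_virtual(driver_list_pids())


def memory_usage_hooked(driver_memory_usage, ns, virtual_pids):
    """Second request: translate on the way in and again on the way out."""
    global_pids = ns.to_global(virtual_pids)   # pre-stage: virtual -> global
    usage = driver_memory_usage(global_pids)   # the driver sees global PIDs only
    # post-stage: global -> virtual
    return {ns.to_virtual([g])[0]: mem for g, mem in usage.items()}


# Hypothetical driver state: global PID -> GPU memory in MiB.
FAKE_DRIVER = {48211: 1024, 48300: 512, 901: 256}


def fake_list_pids():
    return list(FAKE_DRIVER)


def fake_memory_usage(global_pids):
    return {p: FAKE_DRIVER[p] for p in global_pids}


if __name__ == "__main__":
    ns = PidNamespace({48211: 17, 48300: 42})
    pids = list_pids_hooked(fake_list_pids, ns)
    print(pids)
    print(memory_usage_hooked(fake_memory_usage, ns, pids))
```

Global PID 901 belongs to no process in the toy namespace, so it is filtered out of the container's view; the two remaining processes appear under their virtual PIDs in both the list and the memory report.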

NOTE

tested on

  • kernel 4.15.0-136 x64, docker 19.03.15, NVIDIA driver 440.64
  • kernel 4.19.0-14 x64, NVIDIA driver 460.32

Afterword: we'd like to keep maintaining the project, testing it thoroughly and supporting more kernels and NVIDIA drivers. However, we sincerely hope NVIDIA will fix this with simplicity and professionalism. Thx.

mpu's Issues

There may be a way to fix the error about kallsyms_lookup_name here

My OS is Ubuntu 20.04 LTS. I also want to use mpu so that nvidia-smi in the container shows process IDs, but compiling mpu fails because the Ubuntu 20.04 kernel is too new.
I looked for a way to fix this in mpu's code and found a workaround in another project:
xcellerator/linux_kernel_hacking#3
However, I don't know C and can't add this fix to mpu myself.
Could you please check whether this approach works and try to apply it? THX.

Kernel Panic and Symbol Export Issues on CentOS Stream 9 with Kernel 5.14.0-404.el9

Summary

Encountering a kernel panic related to write_syscall function and an unresolved symbol error for kallsyms_lookup_name on CentOS Stream 9 running kernel version 5.14.0-404.el9.x86_64.

Detailed

Kernel Panic on Write to sys_call_table

While attempting to modify the sys_call_table, a kernel panic occurs due to write protection, which seems to be related to the pinned sensitive bits in CR0 and CR4 as of kernel version 5.3 (referenced here).

The current method of modification triggers a permissions violation, resulting in a system crash. Below is the kernel log snippet capturing the panic:

[ 4632.359092] BUG: unable to handle page fault for address: ffffffff998017a0
[ 4632.359654] #PF: supervisor write access in kernel mode
[ 4632.360207] #PF: error_code(0x0003) - permissions violation
[ 4632.360756] PGD 2104015067 P4D 2104015067 PUD 2104016063 PMD 80000021034000e1 
[ 4632.361323] Oops: 0003 [#1] PREEMPT SMP PTI
[ 4632.361882] CPU: 24 PID: 10286 Comm: insmod Kdump: loaded Tainted: P        W  OE     -------  ---  5.14.0-404.el9.x86_64 #1
[ 4632.362476] Hardware name: Inspur IIMS/IIMS, BIOS 4.0.05 08/22/2018
[ 4632.363068] RIP: 0010:mpu_init_ioctl_hook+0x8b/0xc0 [mpu]
[ 4632.363657] Code: 4c 89 25 80 44 00 00 48 89 1d 81 44 00 00 48 89 05 82 44 00 00 0f 20 c5 48 89 ef 48 81 e7 ff ff fe ff e8 48 b0 7a d7 48 89 ef <48> c7 83 80 00 00 00 b0 80 09 c1 e8 35 b0 7a d7 31 c0 5b 5d 41 5c
[ 4632.364876] RSP: 0018:ffffac0e1a02bda8 EFLAGS: 00010286
[ 4632.365491] RAX: 0000000000000000 RBX: ffffffff99801720 RCX: 0000000000000027
[ 4632.366121] RDX: 0000000000000027 RSI: ffffffff9a467b00 RDI: 0000000080050033
[ 4632.366740] RBP: 0000000080050033 R08: 80000000ffff8c20 R09: ffffac0e1a02bd28
[ 4632.367366] R10: 0000000000000001 R11: 000000000000001b R12: ffff99514b03e268
[ 4632.367998] R13: ffffac0e1a02be68 R14: 0000000000000003 R15: 0000000000000000
[ 4632.368632] FS:  00007f4d7e170740(0000) GS:ffff99d13b500000(0000) knlGS:0000000000000000
[ 4632.369286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4632.369940] CR2: ffffffff998017a0 CR3: 0000008164078003 CR4: 00000000007706e0
[ 4632.370604] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4632.371271] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4632.371937] PKRU: 55555554
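
The CR0 value printed in the log can be decoded by hand. On x86, write protection is controlled by the WP bit, bit 16 of CR0. The sketch below is plain arithmetic on the logged value; the helper names are hypothetical and nothing here touches a real register.

```python
CR0_WP = 1 << 16  # Write Protect bit of the x86 CR0 register


def write_protect_enabled(cr0):
    """True if the WP bit is set in the given CR0 value."""
    return bool(cr0 & CR0_WP)


def clear_write_protect(cr0):
    """Return the CR0 value with the WP bit cleared."""
    return cr0 & ~CR0_WP


CR0_FROM_LOG = 0x80050033  # the "CR0:" value printed in the log above

if __name__ == "__main__":
    print(write_protect_enabled(CR0_FROM_LOG))
    print(hex(clear_write_protect(CR0_FROM_LOG)))
```

The logged value still has WP set, which matches the "permissions violation" reported for the supervisor write: the attempt to turn write protection off had not taken effect when the table was written.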

Undefined Symbol kallsyms_lookup_name

Since kernel version 5.7, the kallsyms_lookup_name symbol is no longer exported, which is causing the module build process to fail with an undefined symbol error.

This issue has been previously mentioned in Issue #11 and PR #15 . The error log is as follows:

ERROR: modpost: "kallsyms_lookup_name" [/root/mpu/mpu.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:134: /root/mpu/Module.symvers] Error 1
make[2]: *** Deleting file '/root/mpu/Module.symvers'
make[1]: *** [Makefile:1841: modules] Error 2

Suggested Actions

  • For issue 1, I will raise a PR to modify CR0 directly using MOV operation.
  • For issue 2, merge PR #15.

Finally, in the above environment with Linux Container (Issue #12), nvidia-smi outputs the correct results.

Installation safely possible for these specs?

Hi,

Thanks for this kernel extension.

Is it safe to install with the following specs?

  • NVIDIA-SMI: 470.86
  • Driver Version: 470.86
  • CUDA Version: 11.4
  • Kernel: 3.10.0-1160.45.1.el7.x86_64

Since I am not familiar with kernel extensions, I am a little hesitant to simply give it a try.
If no weird side effects can occur I would give it a shot and if it works you can update the README accordingly, adding the above specs as being compatible.

Thanks for your input.

Best regards
Lars

Edit*: Also, if you find the time, can you maybe explain how your solution is different from this repo https://github.com/gh2o/nvidia-pidns? It was also referenced in the corresponding nvidia-docker Github issue NVIDIA/nvidia-docker#179.
