Comments (20)
@crepererum I think that htop problem might be the same as this : https://jvns.ca/blog/2018/01/28/mac-freeze/ - which we should handle (with the sleep call here: https://github.com/rbspy/proc-maps/blob/96b274b826d7f9246856986af632b33672e9a7d9/src/mac_maps/mod.rs#L144 and I think apple has fixed the kernel bug in the macos version he's using). The htop issue is mentioned on this twitter thread about that post anyways: https://twitter.com/b0rk/status/957498366606368768
@spencerwilson-optimizely I think this program should reproduce the problem (without all the other rust/py-spy code getting in the way and confusing the issue):
// file: task_for_pid_test.cc
#include <mach/mach_init.h>
#include <mach/port.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, const char * argv[]) {
if (argc <= 1) {
printf("usage %s <pid>\n", argv[0]);
return 1;
}
int pid = atoi(argv[1]);
mach_port_name_t task = MACH_PORT_NULL;
kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
if (kr != KERN_SUCCESS) {
printf("failed to call task_for_pid on pid %i: %x\n", pid, kr);
return kr;
}
printf("success!\n");
return 0;
}
compile with g++ task_for_pid_test.cc -o task_for_pid_test
run with task_for_pid_test 12345
where 12345 is your pid.
from py-spy.
So - it sort of looks like something killed py-spy =(by going 'kill -9' or similar) =(.
Can you figure out what process issued the SIGKILL? There are probably a bunch of ways of getting this, but an easy first stab might be to use dtrace. Running these commands:
curl http://www.brendangregg.com/DTrace/kill.d > kill.d
sudo /usr/sbin/dtrace -qs kill.d
will print out a series of lines with the sending process info and what signal is being sent every time a signal happens on your machine. If you filter down to '9' signals sent at the same time py-spy is killed, you should get the process that is killing it.
from py-spy.
Good idea. I found that each attempt to run sudo py-spy -p $PID -f hott
resulted in a line like
FROM COMMAND SIG TO RESULT
9719 sudo 9 9719 0
sudo is killing itself.
I added some debug logging to sudo via /etc/sudo.conf and obtained these logs: gist.
I'm confident that the security policy in effect is permitting the command to go through:
Sep 26 22:20:10 sudo[9924] policy plugin returns 1
cross-referencing w/
The check_policy() function returns 1 if the command is allowed, 0 if not allowed
from man sudo_plugin. Also note that in the ultra debug mode logs its attempting to exec the py-spy command.
I suspect what's most relevant is
Sep 26 22:20:10 sudo[9924] failed to read child status: EOF
in part because that's the single line that gets logged if I decrease the log level to warn
.
Probably worth mentioning at this point, too:
$ sudo --version
Sudo version 1.8.17p1
Sudoers policy plugin version 1.8.17p1
Sudoers file grammar version 45
Sudoers I/O plugin version 1.8.17p1
from py-spy.
For reference, here's the line of code that's coming from.
from py-spy.
More data, demoing the 'direct invocation' style + showing I'm not trying to use /usr/bin/python:
09:05 ~/code/optimizely-integrations
$ cat foo.py
while True:
1 + 1
09:05 ~/code/optimizely-integrations
$ sudo py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9
from py-spy.
Ah, looks like the process gets created, but as a child of /sbin/launchd:
09:27 ~/code/optimizely-integrations
# via https://askubuntu.com/q/56266
$ logpid() { while sleep 1; do ps -p $1 -o pcpu= -o pmem= ; done; }
09:27 ~/code/optimizely-integrations
$ sudo py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9
09:27 ~/code/optimizely-integrations
$ pstree -p 21178
-+= 00001 root /sbin/launchd
\--- 21178 root /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
09:27 ~/code/optimizely-integrations
$ logpid 21178
99.0 0.0
100.0 0.0
100.0 0.0
99.7 0.0
100.0 0.0
100.0 0.0
100.0 0.0
^C
from py-spy.
Hmm, weird.
Instead of launching py-spy like sudo py-spy ..
can you go like sudo su
to get a shell session as root, and then run py-spy from that ? I'm interested to see if sudo still kills itself in this case.
Can you also enable logging in py-spy? (RUST_LOG=info py-spy --pid 123456
).
I think the error message from using the system python would be different. (also the system python on high sierra is 2.7.10 instead).
from py-spy.
13:12 ~
$ whoami
root
13:12 ~
$ py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9
13:13 ~
$ echo $?
137
13:13 ~
$ RUST_LOG=info py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
INFO 2018-09-27T20:15:24Z: py_spy: Command line args: ArgMatches { args: {"rate": MatchedArg { occurs: 0, indices: [3], vals: ["200"] }, "python_program": MatchedArg { occurs: 2, indices: [1, 2], vals: ["/Users/swilson/.pyenv/versions/2.7.12/bin/python", "foo.py"] }, "duration": MatchedArg { occurs: 0, indices: [4], vals: ["2"] }}, subcommand: None, usage: Some("USAGE:\n py-spy [FLAGS] [OPTIONS] --pid <pid> [python_program]...") }
INFO 2018-09-27T20:15:24Z: py_spy::python_spy: Found process binary @ '/Users/swilson/.pyenv/versions/2.7.12/bin/python2.7'
Killed: 9
13:15 ~
$ pstree -p 27545
-+= 00001 root /sbin/launchd
\--- 27545 root /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Strangely, Brendan Gregg's dtrace script isn't listing any SIGKILLs, unlike before.
When I run the looping script w/o py-spy and then manually kill it, the signal is observed by dtrace:
$ python foo.py &
[1] 27609
13:21 ~
$ kill -9 27609
# in another shell...
dtrace: system integrity protection is on, some features will not be available
FROM COMMAND SIG TO RESULT
27436 sh 9 27609 0
I'm starting to wonder if system integrity protection might be involved. I attempted to trace system calls to see if I could find what was sending or printing the "Killed: 9", but no luck yet
13:35 ~
$ csrutil status
System Integrity Protection status: enabled.
13:35 ~
$ dtruss py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
dtrace: system integrity protection is on, some features will not be available
dtrace: failed to execute py-spy: dtrace cannot control executables signed with restricted entitlements
from py-spy.
hm, I have another machine, macOS 10.13.4 w/ SIP enabled, and sudo py-spy -- $(pyenv which python) foo.py
works fine there.
from py-spy.
I don't think this problem is related to SIP - iirc the error messages when hitting SIP are a permissions denied error (also it's not getting that far =).
Based off the logs, py-spy is getting killed when reading the virtual memory maps - which really only calls task_for_pid followed by a couple of mach_vm_region calls: https://github.com/rbspy/proc-maps/blob/96b274b826d7f9246856986af632b33672e9a7d9/src/mac_maps/mod.rs#L93-L95
You should be able to see these calls running with something like
sudo dtrace -n 'pid$target::mach_vm_region:entry' \
-n 'pid$target::task_for_pid:entry' \
-c 'py-spy -f o.svg -- python foo.py'
which should list one task_for_pid call followed by a bunch of mach_vm_region (to get the memory maps). if this works there will also be an extra task_for_pid call at the end (to get the handle to read the process memory).
I'm really not sure why this would cause a sigkill though =( Do you have any antivirus software running? Or is there any other major differences in the environment between your machines?
from py-spy.
hmm - actually you should be able to use dtruss, and if you can't you probably can't use dtrace either 🤦♂️ .
Can you use dtruss/dtrace on your python (without py-spy) ? like does dtruss python foo.py
bring up the same error?
I'll try out re-enabling SIP tomorrow to test out.
from py-spy.
I identified the issue with dtruss:
21:55 ~
$ which py-spy
/Users/swilson/.pyenv/shims/py-spy
21:55 ~
$ head -n1 $(which py-spy)
#!/usr/bin/env bash
sudo dtruss py-spy ...
attempts to trace /usr/bin/env, which is a no-go under SIP (just like attempting to py-spy on /usr/bin/python, I suspect). Substituting with sudo dtruss $(pyenv which py-spy) ...
works! So I think that's a red herring. Sorry for that distraction >_>
The output of the command you suggested above:
$ sudo dtrace -n 'pid$target::mach_vm_region:entry' -n 'pid$target::task_for_pid:entry' -c "$(pyenv which py-spy) -f o.svg -- python foo.py"
dtrace: system integrity protection is on, some features will not be available
dtrace: description 'pid$target::mach_vm_region:entry' matched 1 probe
dtrace: description 'pid$target::task_for_pid:entry' matched 2 probes
dtrace: pid 14131 has exited
CPU ID FUNCTION:NAME
5 4734 task_for_pid:entry
from py-spy.
Do you have any antivirus software running? Or is there any other major differences in the environment between your machines?
Good idea. I upgraded the other to the same version of macOS and ensured same version of Python and py-spy, and still the one works, one doesn't. There is an antivirus running, Cb Defense. Unfortunately, I can't turn it off without a trip over to our IT team (and even no guarantees then ;) ). I'll pitch it to them to try to test that hypothesis, though. I don't know anything about how antivirus software works, but that's kinda freaky if it can kill things without being observed by dtrace.
from py-spy.
hmm , it looks like it's getting killed in task_for_pid then? I guess we'd have to also log the task_for_pid return too to be sure (instead of just the entry)
sudo dtrace -n 'pid$target::task_for_pid:' \
-c 'py-spy -f o.svg -- python foo.py'
which should dump out something like
CPU ID FUNCTION:NAME
2 28269 task_for_pid:entry
2 28270 task_for_pid:0
2 28271 task_for_pid:3
2 28272 task_for_pid:8
2 28273 task_for_pid:a
2 28268 task_for_pid:return
from py-spy.
dtrace: system integrity protection is on, some features will not be available
dtrace: description 'pid$target::task_for_pid:' matched 14 probes
dtrace: pid 28285 has exited
CPU ID FUNCTION:NAME
2 4991 task_for_pid:entry
2 4992 task_for_pid:0
2 4993 task_for_pid:3
2 4994 task_for_pid:8
Reran it multiple times and the output seemed consistently that ^
from py-spy.
I think I've seen this problem before 🤔
hishamhm/htop#682
from py-spy.
@benfred : In one shell I kicked off python foo.py
, and in another:
$ ./task_for_pid_test 30533
Killed: 9
Really nice work! You called it 🔮
Again, nothing appears in kill.d
's output with each run of ./task_for_pid_test
.
from py-spy.
It's great that we figured out where it's happening - but I'm still not sure of why =(.
It might be worth checking to see if you can disable your antivirus and try it out. I'm leaning towards it being a 3rd party program killing the process rather than something with os restrictions like SIP or taskgated (because I'd expect the task_for_pid call to fail rather than cause the process to get killed if this call is restricted by the OS). It still might be worth turning off SIP and verifying though.
Is there anything in the system log when this happens ? (Like open Console.app -> and checking system.log or /var/log). Or is there anything in the antivirus logs?
from py-spy.
11:27 ~
$ tail -f /var/log/system.log | rg task_for_pid
# I run ./task_for_pid_test in another shell...
Sep 28 11:27:05 us-swilson CbDefense_Svc[8660]: The application "/Users/swilson/task_for_pid_test" attempted to open the process "python3.6" by calling the function "task_for_pid". The operation was blocked and the application terminated by CbDefense
Filed a ticket with our IT team to see if an exception can safely be made.
from py-spy.
awesome! glad we got to the bottom of this.
from py-spy.
Related Issues (20)
- Elf sbss issue HOT 1
- Option to get native thread ID with best effort HOT 2
- Encounter os error 31 with py-spy record
- Python 3.12 support HOT 10
- Python 3.11 --native profiling fails because of bpo-45256 HOT 1
- Seg fault on py-spy record --subprocesses
- Test `--native` option in CI
- redirect to stdoutput
- py_spy fails to find interpreter on Windows HOT 4
- Assertion errors when HOT 2
- `--native` HOT 2
- Consider moving CLI functionality behind a feature flag
- attempt to subtract with overflow in console_viewer.rs
- libproc v0.13.0 doesn't build for me
- Programmatically start/stop py-spy recording HOT 1
- Panics at times when sampling a Django app HOT 1
- No source distribution files available in the Pypi site
- Flame graphs cpu on-off
- Add build instruction to README
- Use as python library HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from py-spy.