Comments (7)

benfred commented on May 18, 2024

I think this is a great idea! I also need to get the OS thread ids for some other things I want to add (profiling native extensions, and doing a better job of detecting when a thread is idle).

I took a quick look at doing this on both Linux and OSX, and I think I can grab the native thread id relatively easily on OSX anyway (calling thread_info with THREAD_IDENTIFIER_INFO returns basically the python thread id in the thread_handle field of the returned struct). I'm still sorting out how to best get this on linux -

from py-spy.

nlevitt commented on May 18, 2024

I looked at this again and discovered that it's now been implemented. Awesome! Thanks!

Caveats:

  • the system thread ids are only printed when you pass --native
  • they are printed in hex, not the most convenient format for cross-referencing with ps or top

I'm a little puzzled by the choice of hex, even for python thread ids. Perhaps there's some context where hex is conventional? But I feel like I mostly see them in decimal. For example:

>>> import threading
>>> threading.current_thread().ident
140520831489792
>>> threading.current_thread()
<_MainThread(MainThread, started 140520831489792)>

So I would advocate for printing all the thread ids in decimal. (Or it could print only the os thread id in decimal, or print both thread ids both in decimal or hex, etc)
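Until the output is switched to decimal, a hex id can be converted by hand for cross-referencing. The snippet below is only a minimal sketch: `hex_to_decimal` is a hypothetical helper name, and it assumes the hex value is a plain hexadecimal string with or without a `0x` prefix (the exact format py-spy prints is not shown in this thread).

```python
import threading


def hex_to_decimal(thread_id_hex: str) -> int:
    """Convert a hex thread id string (with or without '0x') to an int."""
    return int(thread_id_hex, 16)


# Round-trip the current Python thread id through its hex form.
ident = threading.get_ident()
as_hex = hex(ident)
print(as_hex, "->", hex_to_decimal(as_hex))
```

The decimal value is the form that `threading.current_thread()` prints, so it can be compared directly against the repr shown above.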

benfred commented on May 18, 2024

So - it's not quite done yet, which is why I haven't updated this issue =).

I still need to decouple the code that matches the OS thread id to the python thread id so that it can work even if you don't get the native trace. This will also let us have much better estimates of whether the thread is idle (#92).

Matching the OS thread to the python thread id is relatively easy with OSX and Windows - my problem is getting this going on Linux. The current code is a bit of a hack and involves grabbing the python thread id from the RBX register of the top level frame of the native stack:

    // On unix based systems w/ pthreads - the python thread id
    // is contained in the RBX register of the last frame (aside from main frame)
    // This is sort of a massive hack, but seems to work
    #[cfg(unix)]
    {
        let next_bx = cursor.bx();
        if next_bx != 0 && threadids.contains(&next_bx) {
            python_thread_id = next_bx;
        }
    }
}

The problem with this code is that it involves unwinding the native stack for the thread - which we obviously have to do with the --native option, but is unnecessary otherwise. Also, unwinding native stacks still doesn't work all that reliably yet here (see #2).

One option I was toying with for this is calling the PyThread_get_thread_ident function for each native thread on linux systems using ptrace (as shown here https://github.com/eklitzke/ptrace-call-userspace) instead of reading the top level RBX register. I'm not sure if this is a better idea than just trying to get better native stack unwinding going - and continuing on using the RBX hack though =(.
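For reference, the function mentioned above can be called from inside a Python process with ctypes, which shows what value it returns. This is only an in-process sketch: it does not do the cross-process ptrace injection described above, and `c_api_thread_ident` is a hypothetical wrapper name. It assumes a CPython build where `ctypes.pythonapi` is available.

```python
import ctypes
import threading

# PyThread_get_thread_ident is part of CPython's C API and returns an
# unsigned long identifying the calling thread.
_get_ident = ctypes.pythonapi.PyThread_get_thread_ident
_get_ident.restype = ctypes.c_ulong
_get_ident.argtypes = []


def c_api_thread_ident() -> int:
    """Return the Python thread id of the calling thread via the C API."""
    return _get_ident()


# Compare against the value Python itself reports for this thread.
print(c_api_thread_ident(), threading.get_ident())
```

On common CPython builds the two printed values are expected to be the same, since `threading.get_ident()` reports the same thread identifier. Calling the function in another process, once per native thread, is the part that would need ptrace.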

I agree about reporting thread ids in decimal - will make that change in the next dev release.

nlevitt commented on May 18, 2024

Wow, thanks for all your work on this!

P.S. I've always found it annoying that python hides the OS thread id.
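As a side note, Python 3.8 and later expose the OS thread id through `threading.get_native_id()`. The sketch below prints both ids for the main thread and one worker; `os_thread_id` and `record` are hypothetical helper names, and the fallback to `None` is an assumption for older interpreters or platforms where the function is missing.

```python
import threading


def os_thread_id():
    """Return the kernel-assigned id of the calling thread, or None if unavailable."""
    getter = getattr(threading, "get_native_id", None)
    return getter() if getter is not None else None


results = {}


def record(name):
    # Store (python thread id, OS thread id) for the calling thread.
    results[name] = (threading.get_ident(), os_thread_id())


record("main")
worker = threading.Thread(target=record, args=("worker",))
worker.start()
worker.join()

for name, (py_id, native_id) in results.items():
    print(f"{name}: python id {py_id}, OS id {native_id}")
```

The OS id is the value that tools such as ps and top display for a thread, which makes it the one to cross-reference.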

benfred commented on May 18, 2024

This PR refactors so that we will usually have the OS thread id in --dump: #123.

benfred commented on May 18, 2024

You can install it with pip install py-spy==0.2.0.dev2

nlevitt commented on May 18, 2024

Thanks!
