
Comments (8)

eric-wieser commented on August 19, 2024

Context: writing withhacks.frame_utils.extract_code

from bytecode.

vstinner commented on August 19, 2024

Hi, bytecode provides two very different kinds of instructions: abstract instructions (Instr) and concrete instructions (ConcreteInstr). Which one do you want?

Building a bytecode object from the same code is trivial, as you showed: ConcreteBytecode.from_code(frame.f_code)

The problem is with abstract bytecode: an abstract instruction is strongly linked to an abstract bytecode object. For example, jumps use label objects stored in the bytecode object. frame.f_lasti is hard to compute from an abstract bytecode, because you have to assemble the abstract bytecode to concrete bytecode, resolve jumps, etc.

So maybe we should limit ourselves to concrete bytecode. I suggest that you add a method to ConcreteBytecode to get an instruction by its offset: return None if the offset is not exactly the start of an instruction, raise an IndexError if the offset is negative or out of the code.

Why not return the instruction directly in your promote_concrete_index() function?
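The offset lookup suggested above could be sketched roughly as follows. This is only an illustration, not the library's implementation: `StubInstr` and `StubSetLineno` are hypothetical stand-ins, and the sketch assumes that each concrete instruction exposes its length in bytes as a `size` attribute while line-number markers occupy no bytes. It returns an index rather than the instruction itself.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StubInstr:
    """Hypothetical stand-in for a concrete instruction with a byte size."""
    name: str
    size: int


@dataclass
class StubSetLineno:
    """Hypothetical stand-in for a line-number marker (no bytes emitted)."""
    lineno: int


def get_instr_index(instructions: Sequence[object], offset: int) -> Optional[int]:
    """Return the list index of the instruction that starts at `offset`.

    Returns None if `offset` falls inside an instruction, and raises
    IndexError if `offset` is negative or past the end of the code.
    """
    if offset < 0:
        raise IndexError("offset is negative")
    position = 0  # byte offset of the instruction currently being examined
    for index, item in enumerate(instructions):
        size = getattr(item, "size", None)
        if size is None:
            continue  # assumed pseudo-instruction: takes no space in the code
        if position == offset:
            return index
        if position > offset:
            return None  # offset landed in the middle of the previous instruction
        position += size
    if offset >= position:
        raise IndexError("offset is past the end of the code")
    return None  # offset is inside the last instruction


# Illustrative data: instructions start at byte offsets 0, 2 and 6.
code = [
    StubSetLineno(1),
    StubInstr("LOAD_CONST", 2),
    StubSetLineno(2),
    StubInstr("LOAD_GLOBAL", 4),
    StubInstr("RETURN_VALUE", 2),
]
index_at_2 = get_instr_index(code, 2)
index_at_4 = get_instr_index(code, 4)
```

With the sample data, `index_at_2` is the list position of the second real instruction, and `index_at_4` is None because offset 4 falls inside a four-byte instruction.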


eric-wieser commented on August 19, 2024

> Which one do you want?

Ideally, I want to convert frame.f_lasti into an index for Bytecode.from_code(frame.f_code), but the easiest way to do that seems to be to go via an index into ConcreteBytecode.from_code(frame.f_code) first.

> frame.f_lasti is hard to compute from an abstract bytecode

I agree. But in this case, we have the intermediate concrete bytecode to work with too. What I'm looking for in my second bullet point is a mapping between ConcreteInstr and Instr objects. Am I correct in assuming that the number of instructions in b and c, where b = c.to_bytecode(), is always the same?

> suggest that you add a method to ConcreteBytecode to get an instruction by its offset: return None if the offset is not exactly the start of an instruction, raise an IndexError if the offset is negative or out of the code.

Sounds good, other than...

> Why not return the instruction directly in your promote_concrete_index() function?

Because I want to analyze frame.f_code[frame.f_lasti:], so need the index for slicing.


vstinner commented on August 19, 2024

> Am I correct in assuming that the number of instructions in b and c, where b = c.to_bytecode(), is always the same?

Conversion from concrete to abstract bytecode should keep the same number of ConcreteInstr/Instr, but it adds Label objects. So a naive bytecode[index] doesn't work.
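One way to work around the extra Label objects is to translate a concrete index by counting only real instructions. The sketch below takes the statement above at face value and assumes the conversion keeps the same number of instructions in the same order. `Instr`, `Label` and `SetLineno` here are small hypothetical stand-ins defined in the example, not the library's classes, and `promote_index` is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical stand-in for a real instruction (concrete or abstract)."""
    name: str


class Label:
    """Hypothetical stand-in for a jump target marker in abstract bytecode."""


@dataclass
class SetLineno:
    """Hypothetical stand-in for a line-number marker."""
    lineno: int


def is_pseudo(item: object) -> bool:
    """True for entries that are not real instructions."""
    return isinstance(item, (Label, SetLineno))


def promote_index(concrete: list, abstract: list, concrete_index: int) -> int:
    """Map an index into `concrete` to the matching index into `abstract`.

    Assumes both lists hold the same real instructions in the same order,
    so only pseudo entries shift the positions.
    """
    # How many real instructions come before the requested position?
    rank = sum(1 for item in concrete[:concrete_index] if not is_pseudo(item))
    seen = 0
    for index, item in enumerate(abstract):
        if is_pseudo(item):
            continue
        if seen == rank:
            return index
        seen += 1
    # Nothing left: return the end position, which is still valid for slicing.
    return len(abstract)


concrete = [SetLineno(1), Instr("LOAD_FAST"), Instr("JUMP_FORWARD"), Instr("RETURN_VALUE")]
abstract = [SetLineno(1), Instr("LOAD_FAST"), Instr("JUMP_FORWARD"), Label(), Instr("RETURN_VALUE")]

# Index 3 in the concrete list is RETURN_VALUE; the Label shifts it by one.
abstract_index = promote_index(concrete, abstract, 3)
remaining = abstract[abstract_index:]
```

Returning `len(abstract)` when no instruction remains keeps the result usable as a slice boundary, at the cost of not being a valid lookup index.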

> Because I want to analyze frame.f_code[frame.f_lasti:], so need the index for slicing.

Oh ok, so it makes sense to add a get_instr_index(offset) method. Getting the instruction object is as simple as bytecode[index] anyway.


eric-wieser commented on August 19, 2024

> Getting the instruction object is as simple as bytecode[index] anyway.

Not quite, because bytecode[index] might be a SetLineNo, right? Also, index might equal len(self), which is OK for slicing, but not for lookup.
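The slicing versus lookup difference is ordinary Python sequence behaviour. A minimal illustration with a plain list standing in for a bytecode object (the instruction names are placeholders):

```python
instructions = ["LOAD_FAST", "LOAD_CONST", "RETURN_VALUE"]
end = len(instructions)

# Slicing from the end position is valid and yields an empty list.
remaining = instructions[end:]

# Looking up the same position raises IndexError.
try:
    instructions[end]
except IndexError:
    lookup_failed = True
else:
    lookup_failed = False
```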


vstinner commented on August 19, 2024

get_instr_index(offset) is the most explicit.

> index(value=None, code_offset=None), which overloads Sequence.index

Hm, what does Bytecode.index(instr) currently do?


eric-wieser commented on August 19, 2024

> Hum, what does Bytecode.index(instr) currently [do]?

It returns the index i such that Bytecode[i] == instr.

> get_instr_index(offset) is the most explicit.

I worry that there needs to be some explicit mention of the code object in the function or argument name


rocky commented on August 19, 2024

@eric-wieser Although I confess that I find this meandering thread hard to follow, I have a general sense that this overall kind of thing has been done through various combinations of programs I've written. At a high level, the Python trepan debuggers have a disassemble command which shows the disassembly at a given frame offset. So in a sense they can retrieve instructions starting from that point.

Furthermore, the debuggers can show you a deparse of the code around that point, using the deparse command. Underneath, you have either a list of bytecode instructions (for the disassemble command) or a parse tree whose leaves are instructions stored as namedtuples (for the deparse command).

The debuggers rely on the libraries uncompyle6 and xdis. But there is a library called xasm that corresponds more closely to bytecode, although in some respects it is more primitive.

Ideally, my projects would align more closely with this library.

