Comments (9)

mewmew commented on April 28, 2024

+1

I've been thinking along these lines as well. It should be possible to identify which registers are used as function arguments by locating instructions in the function that use a register before it is defined in the use-def chains. The same approach could be applied to arguments passed on the stack, but this would require tracking updates to esp through a range of instructions (e.g. push/pop, add esp, sub esp, ...).
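
To make the use-before-def idea concrete, here is a minimal Python sketch. It is not MC-Sema code: the `Insn` record, the `insn` helper and the hand-written `uses`/`defs` sets are all assumptions made for illustration, and it only covers a single straight-line block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """One instruction, reduced to the registers it reads and writes."""
    text: str
    uses: frozenset
    defs: frozenset


def insn(text, uses=(), defs=()):
    # Hypothetical helper; a real tool would derive uses/defs from a disassembler.
    return Insn(text, frozenset(uses), frozenset(defs))


def candidate_register_args(block):
    """Return the registers that are read before any write in the block."""
    defined = set()
    args = set()
    for i in block:
        # A register read before anything in the block wrote it
        # must have received its value from the caller.
        args |= i.uses - defined
        defined |= i.defs
    return args


# Hypothetical function body: ebx is written before it is read,
# so only ecx and edx look like incoming arguments.
body = [
    insn("mov eax, ecx", uses={"ecx"}, defs={"eax"}),
    insn("add eax, edx", uses={"eax", "edx"}, defs={"eax"}),
    insn("mov ebx, 1", defs={"ebx"}),
    insn("add eax, ebx", uses={"eax", "ebx"}, defs={"eax"}),
]

print(sorted(candidate_register_args(body)))  # ['ecx', 'edx']
```

A function with branches would need a proper liveness analysis over the control-flow graph instead of one linear pass, and callee-saved registers that are only pushed and popped would have to be filtered out, since a `push ebp` also reads a register before defining it.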

from mcsema.

kumarak commented on April 28, 2024

Are you looking to have some kind of dispatcher or wrapper function associated with each entry point, which saves/restores the volatile registers on the stack?

mewmew commented on April 28, 2024

> Are you looking to have some kind of dispatcher or wrapper function associated with each entry point, which saves/restores the volatile registers on the stack?

  1. Only external function calls would require a wrapper, as those functions cannot be modified and must therefore retain the original mapping of registers and stack variables to function arguments.
  2. However, internal functions (functions decompiled by MC-Sema and only accessible from within the binary) may be redefined such that all function arguments (registers for now, maybe stack variables later?) are identified and propagated to the function declaration.

Internal functions which are exported may use a combination of these techniques. To provide a consistent handling of decompiled functions, while still allowing external code to call these functions, they may be decompiled using the method described in 2, and a wrapper function (as per item 1) may be defined which translates the native context (based on the original calling convention) to the context expected by the function.
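
A rough Python sketch of that combination, for illustration only. `NativeContext`, `lifted_add` and `lifted_add_export` are made-up names, and the choice of ecx/edx as inputs and eax as the result is an assumption about a hypothetical function, not something MC-Sema prescribes.

```python
from dataclasses import dataclass


@dataclass
class NativeContext:
    """Hypothetical register file as seen at a native call site."""
    eax: int = 0
    ecx: int = 0
    edx: int = 0


def lifted_add(ecx, edx):
    """Internal form (item 2): the registers the function reads are explicit arguments."""
    return (ecx + edx) & 0xFFFFFFFF


def lifted_add_export(ctx):
    """Wrapper (item 1): translate the native context to explicit arguments and back."""
    ctx.eax = lifted_add(ctx.ecx, ctx.edx)


# Internal callers use lifted_add directly; external callers go through the wrapper.
ctx = NativeContext(ecx=2, edx=40)
lifted_add_export(ctx)
print(ctx.eax)  # 42
```

The point of the split is that only the exported entry point pays for packing and unpacking the context, while calls between decompiled functions can pass just the values they need.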

kumarak commented on April 28, 2024

Hi Robin, thanks for explaining. One of my observations was that Mc-sema passes struct.regs when calling an internal function, which sometimes leads to an increase in the number of instructions for the translated block.
It would be a nice addition to have some kind of register tracking system that can learn the register assignment/usage pattern across caller/callee functions, recover some information about the function signature, and use it in the declaration. For arguments passed on the stack, tracking operations involving esp or ebp can give a clue, assuming that any native asm code involved followed the correct calling conventions.
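
As a sketch of what tracking esp could look like, the Python below walks a textual instruction list, keeps a running esp delta relative to function entry, and reports `[esp+N]` references that land above the return address. The instruction syntax, the `stack_arg_offsets` name and the 32-bit layout (return address at entry `[esp]`, first argument at entry `[esp+4]`) are assumptions for the example; ebp-based frames and indirect esp updates are not handled.

```python
import re

# Matches memory operands of the form [esp+N] with a decimal or hex N.
ESP_MEM = re.compile(r"\[esp\s*\+\s*(0x[0-9a-fA-F]+|\d+)\]")


def stack_arg_offsets(lines):
    """Return entry-relative offsets of stack slots that look like arguments."""
    delta = 0  # current esp minus esp at function entry
    found = set()
    for line in lines:
        # Resolve memory operands with the esp value before this instruction.
        for m in ESP_MEM.finditer(line):
            entry_offset = delta + int(m.group(1), 0)
            if entry_offset >= 4:  # offset 0 holds the return address
                found.add(entry_offset)
        op, _, rest = line.partition(" ")
        operands = [o.strip() for o in rest.split(",")] if rest else []
        if op == "push":
            delta -= 4
        elif op == "pop":
            delta += 4
        elif op in ("add", "sub") and operands and operands[0] == "esp":
            imm = int(operands[1], 0)
            delta += imm if op == "add" else -imm
    return sorted(found)


# Hypothetical function body with one saved register and 8 bytes of locals.
body = [
    "push ebx",
    "sub esp, 8",
    "mov eax, [esp+16]",  # entry offset 4: first argument
    "mov ebx, [esp+20]",  # entry offset 8: second argument
    "mov [esp+4], eax",   # entry offset -8: a local, ignored
    "add esp, 8",
    "pop ebx",
    "ret",
]

print(stack_arg_offsets(body))  # [4, 8]
```

This only works if the code follows the expected calling convention, which is exactly the caveat about native asm above.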

mewmew commented on April 28, 2024

Hi @pgoodman,

Does closing this issue mean that the feature has been implemented in MC-Semantics? If so, would you care to point us in the direction of where we may find information regarding how the problem was solved, and what the design of the solution may look like? Was this developed in the new_reg_assign branch?

Hope to hear from you.

Cheers /u

pgoodman commented on April 28, 2024

Hi @mewmew,

I closed this ticket because it has been open for some time without a solid plan to move in this direction. I have some plans related to possibly passing some registers via arguments, but that could take some time.

I think the key thing was that this idea was slightly uninformed and didn't consider the alternatives. The bitcode produced by mcsema since merging the new_reg_assign branch is substantially different from what it was in 2015, when this issue was opened. The latest approach involves directly addressing into the register state structure (a pointer to which is passed as an argument). The old approach was to copy values out of the register state struct and into alloca'd variables.
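
As a loose analogy only (the real output is LLVM bitcode, not Python), the two styles can be pictured like this. `RegState`, `lifted_copying` and `lifted_direct` are names invented for the sketch and do not correspond to anything in mcsema.

```python
from dataclasses import dataclass


@dataclass
class RegState:
    """Hypothetical register state structure passed to every lifted function."""
    eax: int = 0
    ebx: int = 0
    ecx: int = 0


def lifted_copying(state):
    # Old style: copy every register into a local, work on the locals,
    # then copy everything back, whether or not it was touched.
    eax, ebx, ecx = state.eax, state.ebx, state.ecx
    eax = (eax + ecx) & 0xFFFFFFFF
    state.eax, state.ebx, state.ecx = eax, ebx, ecx


def lifted_direct(state):
    # New style: address the fields of the state structure directly
    # and touch only what the code actually uses.
    state.eax = (state.eax + state.ecx) & 0xFFFFFFFF


a = RegState(eax=1, ebx=7, ecx=2)
b = RegState(eax=1, ebx=7, ecx=2)
lifted_copying(a)
lifted_direct(b)
print(a == b, b.eax)  # True 3
```

Both produce the same state; the difference is how much copying surrounds the one operation that matters.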

The next planned improvement that will touch on performance will be Issue #91. After that, I may look into lifting a subset of registers as arguments to lifted functions. The number of registers passed this way will vary from function to function.

mewmew commented on April 28, 2024

@pgoodman Thank you for the quick reply and for providing some background on the work that has been done and which direction it is heading in.

I would be very interested in a high-level road map for the MC-Semantics project. What are the current big challenges, and how are they prioritized? What are the short, medium and semi-long term goals of the project? I think a road map such as this may help the project by focusing on solving larger problems in the long run, instead of micro-managing to solve smaller issues.

Just to be clear, I am not suggesting that MC-Semantics is doing one or the other. I just wish to gain some further insight into the project, as I'm very interested in the problem it sets out to solve, and would love for the project to have a bright future.

So, to sum up: do you know if such a road map currently exists? If so, is it publicly available, or internal to Trail of Bits?

Cheerful regards /u

mewmew commented on April 28, 2024

Peter,

I only now realized you are from Canada. May I ask whereabouts? I am travelling for the next six months throughout Canada, and currently staying at a hostel in Toronto. Love it!

Cheers /u

pgoodman commented on April 28, 2024
