Comments (9)
+1
I've been thinking along these lines as well. It should be possible to identify which registers are used as function arguments by locating instructions in the function that use a register before it is defined in the use-def chains. The same approach could be applied to arguments passed on the stack, but this would require tracking updates to esp through a range of instructions (e.g. push/pop, add esp, sub ebp, ...).
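A minimal sketch of the use-before-def idea, assuming a straight-line function body and a toy instruction model. The `Instr` class and `live_in_registers` helper are hypothetical names made up for illustration; they are not part of MC-Sema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: the registers it reads (uses) and writes (defs)."""
    text: str
    uses: tuple = ()
    defs: tuple = ()


def live_in_registers(instrs):
    """Return registers that are read before any write, in order of first use.

    Assumption: straight-line code only. A real analysis would run
    liveness over the control-flow graph instead of a single pass.
    """
    defined = set()
    candidates = []
    for ins in instrs:
        # Uses are checked before defs, so `add eax, edx` reads the old eax.
        for reg in ins.uses:
            if reg not in defined and reg not in candidates:
                candidates.append(reg)
        defined.update(ins.defs)
    return candidates


body = [
    Instr("mov eax, ecx", uses=("ecx",), defs=("eax",)),
    Instr("add eax, edx", uses=("eax", "edx"), defs=("eax",)),
    Instr("ret", uses=("eax",)),
]
print(live_in_registers(body))  # ['ecx', 'edx']
```

Note that a pass this simple would also report callee-saved registers that are merely pushed and popped, as well as esp itself, so the result is a set of candidates rather than a confirmed argument list.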
from mcsema.
Are you looking to have some kind of dispatcher or wrapper function associated with each entry point, which saves and restores the volatile registers on the stack?
> Are you looking to have some kind of dispatcher or wrapper function associated with each entry point, which saves and restores the volatile registers on the stack?
1. Only external function calls would require a wrapper, as those functions cannot be modified and must therefore retain the original mapping between registers and stack variables to function arguments.
2. However, internal functions (functions decompiled by MC-Sema and only accessible from within the binary) may be redefined such that all function arguments (registers for now, maybe stack variables later?) are identified and propagated to the function declaration.

Internal functions which are exported may use a combination of these techniques. To provide a consistent handling of decompiled functions, while still allowing external code to call these functions, they may be decompiled using the method described in item 2, and a wrapper function (as per item 1) may be defined which translates the native context (based on the original calling convention) to the context expected by the function.
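The combination for exported internal functions can be modelled roughly as follows. This is only a sketch under stated assumptions: `RegState`, `lifted_add`, and `lifted_add_native` are hypothetical names, the native context is reduced to two registers and a dictionary standing in for stack memory, and the original calling convention is assumed to be 32-bit cdecl.

```python
from dataclasses import dataclass, field


@dataclass
class RegState:
    """Hypothetical native context: two x86 registers plus stack memory."""
    eax: int = 0
    esp: int = 0
    stack: dict = field(default_factory=dict)  # address -> 32-bit word


def lifted_add(a, b):
    """Internal function after redefinition: arguments are explicit parameters."""
    return (a + b) & 0xFFFFFFFF


def lifted_add_native(state):
    """Wrapper for external callers.

    Assumes cdecl: [esp] holds the return address, [esp+4] and [esp+8]
    hold the two arguments, and the result is returned in eax.
    """
    a = state.stack[state.esp + 4]
    b = state.stack[state.esp + 8]
    state.eax = lifted_add(a, b)
    return state


state = RegState(esp=0x1000, stack={0x1000: 0xDEADBEEF, 0x1004: 2, 0x1008: 40})
lifted_add_native(state)
print(state.eax)  # 42
```

Internal callers would call `lifted_add` directly with explicit arguments, while only external code pays for the translation done in the wrapper.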
Hi Robin, thanks for explaining. This was one of my observations: MC-Sema passes struct.regs when calling internal functions, which sometimes increases the number of instructions in the translated block.
It would be a nice addition to have some kind of register tracking system that can learn the register assignment and usage patterns across caller and callee functions, recover information about the function signature, and use it in the function declaration. For arguments passed on the stack, tracking operations involving esp or ebp can give a clue, assuming that any native assembly code followed the correct calling convention.
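As a rough illustration of tracking esp to spot stack arguments, the sketch below follows the stack pointer through a straight-line body and reports reads that land above the return address. The tuple-based instruction encoding and the `stack_argument_offsets` helper are hypothetical, and the code assumes 32-bit cdecl with no ebp-based frame.

```python
def stack_argument_offsets(instrs, word=4):
    """Return entry-relative offsets of stack reads that look like arguments.

    Assumptions: on entry [esp] holds the return address, so anything at
    entry_esp + word or above belongs to the caller; code is straight-line.
    """
    delta = 0  # current esp minus esp at function entry
    args = set()
    for op, operand in instrs:
        if op == "push":
            delta -= word
        elif op == "pop":
            delta += word
        elif op == "sub_esp":
            delta -= operand
        elif op == "add_esp":
            delta += operand
        elif op == "load_esp":  # a read from [esp + operand]
            entry_offset = delta + operand
            if entry_offset >= word:
                args.add(entry_offset)
    return sorted(args)


body = [
    ("push", "ebx"),    # esp = entry - 4
    ("sub_esp", 8),     # esp = entry - 12
    ("load_esp", 16),   # entry + 4: first argument
    ("load_esp", 4),    # entry - 8: a local, not an argument
    ("load_esp", 20),   # entry + 8: second argument
    ("add_esp", 8),
    ("pop", "ebx"),
]
print(stack_argument_offsets(body))  # [4, 8]
```

Accesses through ebp could be handled the same way once the offset of ebp relative to the entry esp is known, which this sketch does not attempt.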
Hi @pgoodman,
Does closing this issue mean that the feature has been implemented in MC-Semantics? If so, would you care to point us in the direction of where we may find information regarding how the problem was solved, and what the design of the solution may look like? Was this developed in the new_reg_assign branch?
Hope to hear from you.
Cheers /u
Hi @mewmew,
I closed this ticket because it has been open for some time without a solid plan to move in this direction. I have some plans related to possibly passing some registers via arguments, though that could take some time.
I think the key thing was that this idea was slightly uninformed, and didn't consider the alternatives. The bitcode produced by mcsema since merging the new_reg_assign branch is substantially different than what it was like in 2015 when this issue was opened. The latest approach involves directly addressing into the register state structure (pointer passed as an argument). The old approach was to copy values out of the register state struct and into alloca'd variables.
The next planned improvement that will touch on performance will be Issue #91. After that, I may look into lifting a subset of registers as arguments to lifted functions. The number of registers passed this way will vary from function to function.
@pgoodman Thank you for the quick reply and for providing some background on the work that has been done and which direction it is heading in.
I would be very interested in a high-level road map for the MC-Semantics project. What are the current big challenges, and how are they prioritized? What are the short-, medium-, and semi-long-term goals of the project? I think a road map such as this may help the project by focusing on solving larger problems in the long run, instead of micro-managing to solve smaller issues.
Just to be clear, I am not suggesting that MC-Semantics is doing one or the other. I just wish to gain some further insight into the project, as I'm very interested in the problem it sets out to solve, and would love for the project to have a bright future.
So, to sum up: do you know if such a road map currently exists? If so, is it publicly available, or internal to Trail of Bits?
Cheerful regards /u
Peter,
I only now realized you are from Canada. May I ask whereabouts? I am travelling for the next six months throughout Canada, and currently staying at a hostel in Toronto. Love it!
Cheers /u