Git Product home page Git Product logo

Comments (10)

ltfish avatar ltfish commented on August 28, 2024
s = pg.found[0].state

print s.se.any_int(s.regs.rdi)
print s.se.any_int(s.regs.rsi)

Here s is the final state, not the initial state at the entrance of this function. Values in RDI and RSI might not be the same as those in initial state since RDI and RSI might be used during the (symbolic) execution of this function. You want to call .regs.rdi and .regs.rsi on your initial state s, which is overwritten later by statement s = pg.found[0].state.

from angr-doc.

mrphrazer avatar mrphrazer commented on August 28, 2024

Hi,

thanks for the quick response! Ok, lets change it to

initial = b.factory.blank_state(addr=start_address)
[...]
found_state = pg.found[0].state

If I get you right, then initial.se.any_int(found_state.regs.rdi) should work? Well, neither this nor, the other way round, found_state.se.any_int(initial.regs.rdi) work as expected.

Perhaps I misunderstand the logic, but I expect to derive a set of constraints that depend on the input.

For instance, assume the following pseucode:

x = 2 *  RDI_init;
y = x + 7
RDI = 5 + y

if (RDI == 100) goto A else B
[...]

Assume I want to reach C. Then, I expect the constraints to look as 2 * RDI_init + 7 + 5 == 100 for the first block. In the end, if the set of path constraints is satisfiable, RDI_init can be derived.

Or do I have to declare RDI explicitly as symbolic or keep track of the changes otherwise to retrieve a concrete input?

Thanks in advance! :)

from angr-doc.

ltfish avatar ltfish commented on August 28, 2024

I think found_state.se.any_int(initial_regs.rdi) should just work. It will give you one solution from all possible solutions.

from angr-doc.

mrphrazer avatar mrphrazer commented on August 28, 2024

found_state.se.any_int(initial_regs.rdi) and similar expressions return 0, every time. The only combination that works is using found_state registers for found_state.se. But as you stated, this won't work in the most (of my) test cases.

For instance, this C code and the corresponding assembly code. Then, applying the script below, the initial regs will be zero. Those values don't satisfy the path to the return 1.

import angr
b = angr.Project("./test")
initial_state = b.factory.blank_state(addr=0x400546)
pg = b.factory.path_group(initial_state, immutable=False)
pg.explore(find=0x400587)
found_state = pg.found[0].state

# does not work
print found_state.se.any_int(initial_state.regs.rdi)
print found_state.se.any_int(initial_state.regs.rsi)

# works for this case
print found_state.se.any_int(found_state.regs.rdi)
print found_state.se.any_int(found_state.regs.rsi)

from angr-doc.

zardus avatar zardus commented on August 28, 2024

I think the confusion stems from the fact that state.regs.rdi gets you
the value at rdi, not some semantic representation of rdi itself. So, in
your example, your initial state's rdi has nothing at all to do with the
final result of rdi. angr is telling you that the initial value of rdi can
be anything, which is true since that initial value is clobbered at
0x40054a.

found_state.se.any_int(initial_regs.rdi) and similar expressions return
0, every time. The only combination that works is using found_state
registers for found_state.se. But as you stated, this won't work in the
most (of my) test cases.

For instance, this
https://gist.github.com/mrphrazer/07fc64397e2614d88be3 C code and the
corresponding assembly code
https://gist.github.com/mrphrazer/b18dc1dd990d89dd4e19. Then, applying
the script below, the initial regs will be zero. Those values don't satisfy
the path to the return 1.

import angr
b = angr.Project("./test")
initial_state = b.factory.blank_state(addr=0x400546)
pg = b.factory.path_group(initial_state, immutable=False)
pg.explore(find=0x400587)
found_state = pg.found[0].state

does not workprint found_state.se.any_int(initial_state.regs.rdi)print found_state.se.any_int(initial_state.regs.rsi)

works for this caseprint found_state.se.any_int(found_state.regs.rdi)print found_state.se.any_int(found_state.regs.rsi)


Reply to this email directly or view it on GitHub
#15 (comment).

from angr-doc.

zardus avatar zardus commented on August 28, 2024

And I just noticed that I basically repeated what Fish said :-)
On Nov 5, 2015 1:58 PM, "Yan" [email protected] wrote:

I think the confusion stems from the fact that state.regs.rdi gets you
the value at rdi, not some semantic representation of rdi itself. So, in
your example, your initial state's rdi has nothing at all to do with the
final result of rdi. angr is telling you that the initial value of rdi can
be anything, which is true since that initial value is clobbered at
0x40054a.

found_state.se.any_int(initial_regs.rdi) and similar expressions return
0, every time. The only combination that works is using found_state
registers for found_state.se. But as you stated, this won't work in the
most (of my) test cases.

For instance, this
https://gist.github.com/mrphrazer/07fc64397e2614d88be3 C code and the
corresponding assembly code
https://gist.github.com/mrphrazer/b18dc1dd990d89dd4e19. Then, applying
the script below, the initial regs will be zero. Those values don't satisfy
the path to the return 1.

import angr
b = angr.Project("./test")
initial_state = b.factory.blank_state(addr=0x400546)
pg = b.factory.path_group(initial_state, immutable=False)
pg.explore(find=0x400587)
found_state = pg.found[0].state

does not workprint found_state.se.any_int(initial_state.regs.rdi)print found_state.se.any_int(initial_state.regs.rsi)

works for this caseprint found_state.se.any_int(found_state.regs.rdi)print found_state.se.any_int(found_state.regs.rsi)


Reply to this email directly or view it on GitHub
#15 (comment).

from angr-doc.

ltfish avatar ltfish commented on August 28, 2024

@mrphrazer Can I get a copy of the binary that you are working on? I'll write an angr script for you.

@zardus When are you coming?

from angr-doc.

mrphrazer avatar mrphrazer commented on August 28, 2024

Oh awesome guys, thanks for your help! Indeed, I expected different behaviour which caused the confusion. Since the path is satisfiable, there exist valid inputs. I should restate the questions how to parse those exactly.

So, seeing a working example will be help for sure. Here is the binary. Thanks, again! :)

from angr-doc.

ltfish avatar ltfish commented on August 28, 2024

Here it is:

import angr

num_0 = None
num_1 = None

def hook_atoi(state):
    if state.se.is_true(state.ip == 0x4005b2):
        state.regs.rax = state.se.ZeroExt(32, num_0)
    elif state.se.is_true(state.ip == 0x4005c8):
        state.regs.rax = state.se.ZeroExt(32, num_1)
    else:
        # Should never get here
        raise Exception('Oops')

def main():
    p = angr.Project("test_binary", load_options={'auto_load_libs': False})
    s = p.factory.blank_state(addr=0x400595)

    global num_0, num_1

    num_0 = s.se.BVS('num_0', 32)
    num_1 = s.se.BVS('num_1', 32)

    # Set two arguments by hooking atoi and returning the corresponding variable
    p.hook(0x4005b2, hook_atoi, length=5)
    p.hook(0x4005c8, hook_atoi, length=5)

    pg = p.factory.path_group(s, immutable=False)
    pg.explore(find=0x4005ed)

    # Now we should have one path found
    assert len(pg.found) == 1
    # There are many solutions... we take an arbitrary one from them, and 
    # further constrain num_0 to be that single solution
    num_0_ints = pg.found[0].state.se.any_n_int(num_0, 20)
    print "num_0 has more than %d solutions" % len(num_0_ints)
    num_0_int = num_0_ints[0]

    # With out constraining, there are many possible solutions to num_1
    num_1_ints = pg.found[0].state.se.any_n_int(num_1, 20)
    print "num_1 has more than %d solutions" % len(num_1_ints)
    # Here is how you can further constrain it
    copied_state = pg.found[0].state.copy()
    copied_state.add_constraints(num_0 == num_0_int)
    num_1_ints = copied_state.se.any_n_int(num_1, 20)
    print "After constraining variable num_0, there are %d solutions to num_1" % len(num_1_ints)
    print "Try running ./test_binary %d %d!" % (num_0_int, num_1_ints[0])

if __name__ == "__main__":
    main()

And there is the output on my machine:

num_0 has more than 20 solutions
num_1 has more than 20 solutions
After constraining variable num_0, there are 1 solutions to num_1
Try running ./test_binary 1 2306412234!

Some words:

  • Because I'm running an internal branch of angr, you might want to replace s.se.BVS with s.se.BV to make this script work on the current public release version of angr. In the next release, it will be s.se.BVS.
  • Our support to atoi is very limited for now. There is a better version in development by @salls, but it didn't work with this example (I don't know why). We'll look into it further. Generally, supporting strings is difficult :-( If you are just interested in having a toy example to run, just set those variables by hooking atoi functions should suffice.
  • Everything else is pretty straightforward and well-commented. Let me know if anything is not clear for you.

from angr-doc.

mrphrazer avatar mrphrazer commented on August 28, 2024

@ltfish Really nice, thank you very much! I got the idea and were able to apply this successfully to the other test cases :)

from angr-doc.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.