Comments (34)
That binary in particular is a position-independent executable (PIE), which means it has no base address. CLE tells you whether the main binary contains position-independent code via loader.main_bin.pic, which is true in your case.
At some point in the dev process we made the decision to have PIEs automatically loaded with a base address of 0x400000, because it's useful when address zero doesn't map to anything.
If you like, we can add a logger message warning that the binary is being loaded at that base address for PIEs? If you want it loaded at zero, you can specify that with b = angr.Project('filename', load_options={'main_opts': {'custom_base_addr': 0}}) or... something like that? Check the docstring for cle.Loader.
from angr-doc.
Yeah, that's what I've ended up doing. A warning would be good though.
On the same problem, I've found this:
UnsupportedSyscallError('no syscall 13 for arch AMD64',)
The binary is creating an alarm for itself (connection timeout i think). For now I'll just add a handler or something to jump over it, but are signals a general limitation of angr? feature in the future? etc
There are a lot, a lot, a lot of unsupported syscalls. If you want something to work, you're probably better off implementing SimProcedures for the individual library functions your binaries use, instead of implementing syscalls.
On the other hand, that alarm behavior is really unimportant for analysis purposes, and the default behavior of an unsupported syscall is to do nothing and return an unconstrained symbolic value, so that error is probably nothing to worry about.
sorry to keep pestering. i've noticed that the explorer seems to run off on its own sometimes. is there a way for angr to tell you what portions of the code it is spending most time on so that the end user can try to optimize around it?
There isn't at the moment, but that'd be an interesting feature. In the meantime, this code could be useful:
import angr
import itertools
import collections
p = angr.Project('/bin/bash')
pg = p.factory.path_group().step(170)
print collections.Counter(itertools.chain(*pg.mp_active.addr_backtrace.mp_items))
That'll step bash 170 times (at which point, it splits a few times), then grabs the backtrace of every path, combines them all, and passes them to collections.Counter, which gives you a count for the most common basic blocks executed.
This suffers from the problem that a basic block is counted once per path where it appears rather than every time it was executed. That is, if you execute blocks A->B and then branch and execute B->C1 and B->C2 in two different paths, A and B will end up being counted twice. Solving that requires analyzing the path history and is probably a decent-sized pain in the ass.
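The double counting is easy to reproduce without angr. Here is a minimal sketch in plain Python 3, where the two lists are made-up stand-ins for the backtraces of two paths (the block names are hypothetical, not real addresses):

```python
import collections
import itertools

# Hypothetical backtraces for two paths that share blocks A and B and then
# diverge. Each list stands in for one path's backtrace.
path_1 = ["A", "B", "C1"]
path_2 = ["A", "B", "C2"]

# Same combination step as the Counter one-liner: chain every backtrace
# together and count how often each block shows up.
counts = collections.Counter(itertools.chain(path_1, path_2))

# A and B were executed once, before the branch, yet each is reported twice.
print(counts.most_common())
```

The counts are per path, so anything executed before a branch gets multiplied by the number of paths that descend from it.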
One other thing you can do is monkeypatch the lifter. Of course, this is super ugly and you should never do it, but if you were to do it, it'd look like this:
import angr
import collections
p = angr.Project('/bin/bash')
_old_lifter = p.factory.block
counter = collections.Counter()
def count_block(*args, **kwargs):
    counter[args[0]] += 1
    return _old_lifter(*args, **kwargs)
p.factory.block = count_block
pg = p.factory.path_group().step(170)
print counter.most_common(10)
Alternately, if you want an easier approach: For a given path, you can look at all the steps it's made with path.backtrace. Look through all the paths (the ones you care about are gonna be in surveyor.active and surveyor.spilled), checking for ones where len(path.backtrace) is pretty huge, and then look where it's running a bunch of blocks you weren't expecting. In all likelihood it's just stuck inside libc, in which case the solution is to write a SimProcedure for whatever library function you were calling.
Does angr support global variable spaces? I'm running into a case where angr just errors out whenever I give it a constraint on a global. To test this, I created a simple xor program that I theoretically should be able to use angr to decrypt.
#include <stdio.h>
#include <string.h>
unsigned char *key = "\xa5\x57\x03\x4a\xa4\xdc\xff\x7c\xfa\xf5\x1f\x52\x79\x6a\x61\xcb";
unsigned char inbuf[256];
unsigned char outbuf[256];
void xor_encrypt()//char *key, char *string)
{
    int i, string_length = strlen(inbuf);
    for(i=0; i<string_length; i++)
    {
        outbuf[i] = inbuf[i] ^ key[i%16];
    }
}
int main() {
    int i;
    printf("Input To Hash: ");
    fgets(inbuf,256,stdin);
    size_t ln = strlen(inbuf) - 1;
    if (inbuf[ln] == '\n')
        inbuf[ln] = '\0';
    xor_encrypt();//key,buf);
    for (i=0; i < ln; i++) {
        printf("%.2x ",outbuf[i]);
    }
    printf("\n");
    return 0;
}
Then I use the following commands for angr. If I add the constraints before asking it to explore, it just fails at the exploration portion:
import angr
import simuvex
import logging
b = angr.Project("a.out",use_sim_procedures=True)
# Get the addresses
inbufAddr = b.loader.main_bin.get_symbol("inbuf").addr
outbufAddr = b.loader.main_bin.get_symbol("outbuf").addr
e = b.surveyors.Explorer(find=(0x400724))
e.run()
s = e.found[0].state
outbuf = s.memory.load(outbufAddr,36)
inbuf = s.memory.load(inbufAddr,36)
s.add_constraints(outbuf[7:0] != 0)
s.se.any_str(inbuf)
One thing that I suspect is happening is that e.run() returned the very first path it found, which happened to be the one with strlen(...) == 0.
Can you check more paths, other than only the first path returned?
It says it has 2 active paths as well. Is there an easy way to keep running those?
There's a parameter num_find on the Explorer constructor.
e = b.surveyors.Explorer(find=(0x400724),num_find=3)
e.run()
<Explorer with paths: 0 active, 0 spilled, 1 deadended, 2 errored, 0 unconstrained, 1 found, 0 avoided, 0 deviating, 0 looping, 0 lost>
e.errored[0]
<Errored Path with 31 runs (at 0x400674, ClaripyOperationError)>
e.errored[0].error
ClaripyOperationError("can't reverse non-byte sized bitvectors",)
well that didn't format right, but the other paths errored. still only one found path
Fuck. We fixed a bug just like that yesterday. Let me make sure everything is clean to push.
I installed from pip a few days back. maybe i need to pull from git?
@rhelmot You might want to wait for my VSA_DDG test to pass...
Here's some more of the error:
In [12]: e.run()
WARNING:simuvex.vex.irsb:<SimIRSB 0x40070d> hit an when analyzing statements
Traceback (most recent call last):
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/irsb.py", line 92, in _handle_irsb
    self._handle_statements()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/irsb.py", line 207, in _handle_statements
    s_stmt = translate_stmt(self.irsb, stmt_idx, self.last_imark, self.state)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/__init__.py", line 31, in translate_stmt
    s.process()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/base.py", line 26, in process
    self._execute()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/store.py", line 34, in _execute
    self.state.memory.store(addr.expr, data_endianness, action=a)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/storage/memory.py", line 163, in store
    self._store(request)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 469, in _store
    req.actual_addresses = self.concretize_write_addr(req.addr)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 202, in concretize_write_addr
    return self._concretize_addr(addr, strategy=strategy, limit=limit)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 168, in _concretize_addr
    raise SimMemoryAddressError("Trying to concretize with unsat constraints.")
SimMemoryAddressError: Trying to concretize with unsat constraints.
This is a non-error. It just means the path is unsatisfiable and should be thrown out. The actual analysis is still running while this prints out; it's just part of a logger message.
Oh, and: we updated the versions of everything on pip last night! Upgrade angr and see if you still get the bitvector-reversing errors.
Nice. The errors are gone. Though for whatever reason it still fails with:
s.se.any_str(inbuf)
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python2.7/dist-packages/simuvex/plugins/solver.py", line 269, in any_str
    return self.any_n_str(e, 1, extra_constraints=extra_constraints)[0]
IndexError: list index out of range
I think there should be a patch for that part to at least return a message to the user saying that there were no solutions. That said, am I asking too much of angr/symbolic execution to solve the xor as described above?
The problem with the code you provided is that you only check the first found path on the explorer. The only thing affecting the length of a path in your program is how many times it goes around the xor_encrypt loop, and going around the loop zero times corresponds to the case where the user entered an empty string. Therefore, the path to which you add the constraint has never had any data written to outbuf, so the first char in outbuf can never be anything but zero, which is why the state goes unsat after you add your constraint.
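To see why the zero-iteration path cannot satisfy the constraint, here is a rough Python 3 model of the C program's xor_encrypt. It is only a sketch: it uses len() in place of strlen, so it assumes the input contains no NUL bytes, and the 256-byte zeroed bytearray stands in for the global outbuf.

```python
# Key bytes copied from the C source posted earlier in the thread.
KEY = bytes.fromhex("a557034aa4dcff7cfaf51f52796a61cb")

def xor_encrypt(inbuf):
    """Model of the C xor_encrypt; outbuf starts zeroed, as a C global does."""
    outbuf = bytearray(256)
    for i in range(len(inbuf)):  # one iteration per input byte
        outbuf[i] = inbuf[i] ^ KEY[i % 16]
    return outbuf

empty = xor_encrypt(b"")   # loop body never runs
one = xor_encrypt(b"A")    # loop body runs exactly once

print(empty[0], hex(one[0]))

# XOR with the same key is its own inverse, so encrypting twice round-trips.
msg = b"hello"
assert xor_encrypt(xor_encrypt(msg)[:len(msg)])[:len(msg)] == msg
```

With an empty input nothing is ever written, so every byte of outbuf is still zero and a "not equal to zero" constraint on any of them has no solution. Each extra loop iteration makes one more byte of outbuf depend on the input.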
I do agree with you though, there should be a better error message there.
I really must be missing something here then. I do the num_find=5 for instance, and use s = e.found[4].state. Even with that, adding that one constraint causes it to fail. I've even tried adding the following at the beginning with no luck in getting the constraints solved:
def stringlen(state):
    state.regs.rax = 16
# Hook the strlen on xor to define the input size
b.hook(0x400663,stringlen,length=5)
If I understand it correctly, that should go to the hook once it gets to the strlen check in the xor function, and instead of checking it will just return that the length is 16. Using that hook still causes it to fail to solve the constraint.
Ah, I see. The problem now is that by default, memory.load loads big-endian values, which is intentional so that you can load strings. outbuf[7:0] pulls out the eight least significant bits in outbuf, which in a big-endian load correspond to the byte in memory with the largest address. If I allow the loop to run 36 times, I can satisfiably set the constraint.
import angr
import simuvex
import logging
b = angr.Project("a.out",use_sim_procedures=True)
# Get the addresses
inbufAddr = b.loader.main_bin.get_symbol("inbuf").addr
outbufAddr = b.loader.main_bin.get_symbol("outbuf").addr
e = b.surveyors.Explorer(find=(0x4006F5), num_find=50)
x = 0
for found in e:
    x += 1
    print 'found', x
    s = found.state
    outbuf = s.memory.load(outbufAddr,36)
    inbuf = s.memory.load(inbufAddr,36)
    s.add_constraints(outbuf[7:0] != 0)
    if s.satisfiable():
        print repr(s.se.any_str(inbuf))
        break
import IPython; IPython.embed()
Prints found 1 found 2 ... found 37 and then drops into the shell.
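The big-endian point can be checked with plain integers, no solver involved. A small Python 3 sketch, where the four-byte buffer and the bit_slice helper are made up for illustration (bit_slice only mimics the [hi:lo] slice notation used on bitvectors above):

```python
# Hypothetical 4-byte buffer; index 0 is the lowest address.
buf = bytes([0x11, 0x22, 0x33, 0x44])

# Interpret the bytes the way a big-endian load does: the byte at the lowest
# address becomes the most significant byte of the resulting integer.
value = int.from_bytes(buf, "big")

def bit_slice(x, hi, lo):
    """Return bits hi..lo (inclusive) of x, mimicking the slice x[hi:lo]."""
    return (x >> lo) & ((1 << (hi - lo + 1)) - 1)

low_byte = bit_slice(value, 7, 0)     # the analogue of outbuf[7:0]
high_byte = bit_slice(value, 31, 24)  # the top eight bits

print(hex(low_byte), hex(high_byte))
```

The low eight bits come from the last byte of the buffer and the top eight bits from the first, which is why a constraint on outbuf[7:0] of a 36-byte load only becomes satisfiable once the loop has written the 36th byte.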
Thanks! Got it to work. That was seriously driving me nuts. Btw, is there a more elegant way to set the constraints than the following?
i = 0
for c in "".join("c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37".split(" ")).decode('hex')[::-1]:
    s.add_constraints(outbuf[i + 7:i] == ord(c))
    i += 8
It works, but seems a little janky.
number = int('c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37'.replace(' ', ''), 16)
s.add_constraints(outbuf == number)
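As a sanity check outside angr, the single big number and the byte-by-byte loop describe the same bits. A minimal Python 3 sketch using only the hex string from the thread (the variable names are illustrative):

```python
hex_bytes = ("c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 "
             "76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37")

raw = bytes.fromhex(hex_bytes)                 # the 36 expected output bytes
number = int(hex_bytes.replace(" ", ""), 16)   # same conversion as above

# The integer is exactly the big-endian reading of the byte string.
assert number == int.from_bytes(raw, "big")

# The byte-by-byte loop walks the reversed string and constrains bits
# [i+7:i]; each of those slices picks the same byte out of the integer.
for k, c in enumerate(raw[::-1]):
    assert (number >> (8 * k)) & 0xFF == c

print(len(raw), hex(number & 0xFF), hex(number >> (8 * (len(raw) - 1))))
```

This is also why no reversal is needed in the one-shot version: the first byte of the string lands in the most significant position, matching how the 36-byte load lays out outbuf.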
Do you mean int instead of hex?
On Sep 10, 2015 8:07 PM, "Andrew Dutcher" wrote:
number = hex('c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37'.replace(' ', ''), 16)
s.add_constraints(outbuf == number)
I edited that post within four seconds of posting it, snap!
Ah, poop, I was on email :-)
Ah. I tried doing the same thing with the string itself. Converting it into just a really large number and setting it in one shot (outbuf == number) worked perfectly. Thanks!
I've seen you guys use both Explorer and Path Groups. Is one better than the other?
You should try to use PathGroup instead of Explorer. PathGroup is more flexible. Explorer belongs to the past!
Also, how does Angr decide when it has hit a deadend? I'm playing around with a CTF challenge that Angr returns claiming all it was able to find were a bunch of deadends and a bunch of avoided paths.
A path deadends when no feasible successor can be found.