
falcon's Introduction

falconre

Falconre is a python3 library using pyo3 to wrap:

  • Falcon - A binary analysis framework in Rust
  • Finch - A symbolic executor built on Falcon
  • Raptor - Higher order IR and analysis on top of Falcon

This is alpha-quality software.

What is this really?

This is me (endeav0r) hacking on Falcon and other things to try and automate different simple static analysis tasks.

Here are some things you can try:

Print out all the function calls in a program

git clone https://github.com/endeav0r/corpora
git clone https://github.com/falconre/falconre

# Build some example programs
pushd corpora
./build.sh
popd

# Build falconre in a Docker container
pushd falconre
docker build -t falconre .
# Watch youtube, this will take a few minutes
popd

docker run --rm -ti -v $(pwd):/opt falconre \
  python3 /opt/falconre/examples/print-calls.py \
  /opt/corpora/build/stack_buffer/vuln/one

Find trivial stack buffer overflows

# Run the example stack-writes.py script against a trivially vulnerable stack
# buffer overflow program.
docker run --rm -ti -v $(pwd):/opt falconre \
  python3 /opt/falconre/examples/stack-writes.py \
  /opt/corpora/build/stack_buffer/vuln/one

# Run the example stack-writes.py against a non-vulnerable version of the same
# program.
docker run --rm -ti -v $(pwd):/opt falconre \
  python3 /opt/falconre/examples/stack-writes.py \
  /opt/corpora/build/stack_buffer/not_vuln/one

Symbolically execute a toy amd64 assembly function

docker run --rm -ti -v $(pwd):/opt falconre \
  python3 /opt/falconre/examples/symex-one.py \
  /opt/corpora/build/symex/one

Print out DOT graph of Falcon IL

docker run --rm -ti -v $(pwd):/opt falconre \
  python3 /opt/falconre/examples/falcon-dot-graph.py \
  /opt/corpora/build/symex/one run | dot -Tpng -o /tmp/main.png

Installing

Docker

docker build -t falconre .

Natively on OSX

This is how I use falconre.

Install rust:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

You'll need this thing called "Rust nightly":

rustup toolchain install nightly
rustup default nightly

Get the dependencies

brew install z3 capstone

Install with setuptools

python3 setup.py install

If you don't want to install with setuptools, assuming you want to run the examples:

cargo build --release
cp target/release/libfalconre.dylib examples/falconre.so

falcon's People

Contributors

bstrie, cgfandia, emmanuel099, endeav0r, jeandudey, m4b, oblivia-simplex, srikwit, tnballo, turnersr

falcon's Issues

Eliminate error-chain

Falcon needs a good-and-proper Error enum. This will touch almost all of the codebase.

This is a very heavy lift, and is being forever/continually pushed to the right.

Amd64 Lifter

We already have 32-bit x86 over Capstone. Port that lifter to 64-bit x86.

Complete Documentation

In-line rust documentation for:

  • falcon::engine
  • falcon::executor
  • falcon::il
  • falcon::loader
  • falcon::platform
  • falcon::translator

Elf Linker for MIPS

Need the equivalent of the x86 ElfLinker for MIPS binaries. The linking process for ELF binaries can be cleaned up in general.

Handle values >64 bits

Currently il::Constant only handles values up to 64 bits. This 64-bit restriction may be enforced elsewhere throughout the code base. Falcon should handle values of an arbitrary number of bits.

Feedback on "The Il Nop" blog post

Re: http://reversing.io/posts/the-il-nop/

This is about the most insightful write-up on different IRs I have ever read. After the ramblings of humble me, of course ;-).

I can't claim that I have actually used many different IRs, but I've had enough of LLVM IR, and just a look at the rest of the crowd was enough too. When the time came to wave LLVM IR bye-bye, I figured that readability and familiarity should be the most important properties of an IR (the other properties, like expressiveness and precise semantics, are obvious). That is how PseudoC came about: https://github.com/pfalcon/ScratchABlock/blob/master/docs/PseudoC-spec.md

And Falcon IR gets quite a section in an IR zoo I maintain: https://github.com/pfalcon/ScratchABlock/blob/master/docs/ir-why-not.md

how to run?

Wanted to try this out (nice work btw!)

Do you have a simple cli program that takes a binary as input and outputs whatever?

If not, I'd suggest adding a simple prototype/reference program in lib/bin/main.rs so it is easy to try out, or perhaps something in examples/

Keep up the good work!

Rc symbolic memory pages

Symbolic memory should be paged, with reference counted pages. Forking state will call clone() over these pages, and be exceptionally fast. Writing to a page will call Rc::make_mut, producing a copy at the time of the write.

This should greatly reduce symbolic fork times, and greatly reduce memory usage.
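A minimal sketch of the idea, assuming a page size of 4096 bytes and using concrete u8 values where real symbolic memory would hold expressions. PagedMemory, Page, and sharers are hypothetical names for illustration, not Falcon or Finch types; only Rc::make_mut and Rc::strong_count come from the standard library.

```rust
use std::collections::HashMap;
use std::rc::Rc;

const PAGE_SIZE: usize = 4096; // assumed page size

/// One page of memory. A concrete u8 stands in for a symbolic expression.
#[derive(Clone)]
struct Page {
    bytes: Vec<u8>,
}

/// Hypothetical paged memory: page index -> reference-counted page.
#[derive(Clone, Default)]
struct PagedMemory {
    pages: HashMap<u64, Rc<Page>>,
}

impl PagedMemory {
    fn store(&mut self, address: u64, value: u8) {
        let index = address / PAGE_SIZE as u64;
        let offset = (address % PAGE_SIZE as u64) as usize;
        let page = self
            .pages
            .entry(index)
            .or_insert_with(|| Rc::new(Page { bytes: vec![0; PAGE_SIZE] }));
        // Copies the page only if another state still shares it.
        Rc::make_mut(page).bytes[offset] = value;
    }

    fn load(&self, address: u64) -> Option<u8> {
        let index = address / PAGE_SIZE as u64;
        let offset = (address % PAGE_SIZE as u64) as usize;
        self.pages.get(&index).map(|page| page.bytes[offset])
    }

    /// Number of states currently sharing the page that holds `address`.
    fn sharers(&self, address: u64) -> usize {
        let index = address / PAGE_SIZE as u64;
        self.pages.get(&index).map_or(0, Rc::strong_count)
    }
}

fn main() {
    let mut parent = PagedMemory::default();
    parent.store(0x1000, 0xaa);

    // Forking clones the map of Rc pointers, not the page contents.
    let mut child = parent.clone();
    assert_eq!(parent.sharers(0x1000), 2);

    // The first write in the child copies just this one page.
    child.store(0x1000, 0xbb);
    assert_eq!(parent.load(0x1000), Some(0xaa));
    assert_eq!(child.load(0x1000), Some(0xbb));
    assert_eq!(parent.sharers(0x1000), 1);
}
```

With this layout a fork costs one pointer clone per mapped page, and the page copy is deferred until a state actually writes.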

Handle RDTSC

The rdtsc instruction is currently translated to a nop. We may want to handle this at some point in the future.

il::control_flow_graph::merge is slow

il::control_flow_graph::merge merges adjacent blocks together. It's primarily used immediately after a function is translated. It's slow because of poor logic. Make logic not poor.
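One possible shape for a faster merge, sketched on a deliberately simplified graph. This is not Falcon's implementation: Cfg, mergeable_successor, and the string instructions are hypothetical stand-ins, and edges are assumed to carry no conditions. The sketch keeps successor and predecessor sets up to date so that each block is removed at most once.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified control-flow graph. Strings stand in for IL
/// instructions, and edges carry no conditions.
struct Cfg {
    entry: usize,
    blocks: BTreeMap<usize, Vec<String>>,
    successors: BTreeMap<usize, BTreeSet<usize>>,
    predecessors: BTreeMap<usize, BTreeSet<usize>>,
}

impl Cfg {
    fn new(entry: usize) -> Cfg {
        Cfg {
            entry,
            blocks: BTreeMap::new(),
            successors: BTreeMap::new(),
            predecessors: BTreeMap::new(),
        }
    }

    fn add_block(&mut self, index: usize, instructions: &[&str]) {
        let instructions = instructions.iter().map(|i| i.to_string()).collect();
        self.blocks.insert(index, instructions);
    }

    fn add_edge(&mut self, head: usize, tail: usize) {
        self.successors.entry(head).or_default().insert(tail);
        self.predecessors.entry(tail).or_default().insert(head);
    }

    /// Returns `tail` if `head -> tail` is the only edge out of `head` and
    /// the only edge into `tail`, and `tail` is not the entry block.
    fn mergeable_successor(&self, head: usize) -> Option<usize> {
        let successors = self.successors.get(&head)?;
        let tail = *successors.iter().next()?;
        let sole_predecessor = self.predecessors.get(&tail)?.len() == 1;
        if successors.len() == 1 && sole_predecessor && tail != head && tail != self.entry {
            Some(tail)
        } else {
            None
        }
    }

    /// Merge chains of adjacent blocks. Each block is removed at most once.
    fn merge(&mut self) {
        let indices: Vec<usize> = self.blocks.keys().copied().collect();
        for head in indices {
            if !self.blocks.contains_key(&head) {
                continue; // already merged into an earlier block
            }
            while let Some(tail) = self.mergeable_successor(head) {
                let moved = self.blocks.remove(&tail).unwrap_or_default();
                self.blocks.entry(head).or_default().extend(moved);
                // `head` inherits the outgoing edges of `tail`.
                let inherited = self.successors.remove(&tail).unwrap_or_default();
                self.predecessors.remove(&tail);
                for successor in &inherited {
                    let predecessors = self.predecessors.entry(*successor).or_default();
                    predecessors.remove(&tail);
                    predecessors.insert(head);
                }
                self.successors.insert(head, inherited);
            }
        }
    }
}

fn main() {
    // 0 -> 1 -> 2, then 2 branches to 3 and 4.
    let mut cfg = Cfg::new(0);
    for (index, text) in [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e")] {
        cfg.add_block(index, &[text]);
    }
    for (head, tail) in [(0, 1), (1, 2), (2, 3), (2, 4)] {
        cfg.add_edge(head, tail);
    }
    cfg.merge();
    // Blocks 1 and 2 were folded into block 0; the branch targets remain.
    assert_eq!(cfg.blocks.keys().copied().collect::<Vec<_>>(), vec![0, 3, 4]);
    assert_eq!(cfg.blocks[&0], vec!["a", "b", "c"]);
}
```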

Translator::translate_function_extended leads to Index does not exist for set entry

When retrieving the Loader::program of cyberblogger from https://github.com/trailofbits/cb-multios (compiled with clang 8.0.1), the error Err("Index does not exist for set_entry") is returned from the line control_flow_graph.set_entry(block_indices[&function_address].0)?; at the end of translate_function_extended. This occurs because, for the function at address 0x45008, the translate_block call in the x86 decoder returns immediately due to a CS_ERR_OK. This leaves the translation block with no instructions, which results in a bogus insertion into block_indices at block_indices.insert(*result.0, (block_entry, block_exit)); in translate_function_extended: block_entry and block_exit are both 0, because nothing was inserted into the overall Function CFG and the block_entry variable was never set.

My hacky fix for an empty translation block is up at https://github.com/2over12/falcon

I simply check whether a block has no instructions and, if so, add a new empty CFG to the block. There probably should not be an empty function, though. It is currently unclear to me why falcon_capstone immediately returns a CS_ERR_OK, but it is also probably worth handling an empty function in some sort of sane way.

PE Loader

Need the ability to load PE binaries

ARM Translator Implementation

This project is very, very, relevant to my interests.

I see from the blog post you are using Capstone as the disassembler, which implements the ARM architecture. I'd like to take a stab at an implementation of ARM - at least for ARMv6 in Thumb mode (my current target).

I will use this issue to track an attempt at adding a module in falcon/lib/translator to map the Capstone ARM instruction API to Falcon IL.

If you've thought of this already and have ideas of where it might go wrong please chime in! I'm just familiarizing myself with the codebase now.

il::Constant is slow with num_bigint

The num_bigint crate is slow. Due to requirements in handling amd64 instructions, Falcon moved to a big integer library to support operands > 64 bits in width. This will also be required for SSE/AVX instructions in the future.

However, because il::Constant is now only backed by num_bigint::BigUint, this incurs unacceptable slowdowns during operations such as lifting entire binaries. il::Constant requires more sophisticated logic to back operations over faster u64-native operations when appropriate.
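A sketch of what such a dual representation could look like, using only the standard library. The Repr enum, its variant names, and the limb vector (standing in for num_bigint::BigUint) are assumptions for illustration, not Falcon's actual il::Constant. Constants of 64 bits or fewer take the native path; wider ones fall back to limb arithmetic.

```rust
/// Hypothetical backing store for a constant.
#[derive(Clone, Debug, PartialEq)]
enum Repr {
    /// Fast path: the value fits in one machine word.
    Native(u64),
    /// Slow path: little-endian 64-bit limbs, standing in for a big integer.
    Wide(Vec<u64>),
}

#[derive(Clone, Debug, PartialEq)]
struct Constant {
    repr: Repr,
    bits: usize,
}

/// Mask with the low `bits` bits set.
fn mask(bits: usize) -> u64 {
    if bits >= 64 { u64::MAX } else { (1u64 << bits) - 1 }
}

impl Constant {
    fn new(value: u64, bits: usize) -> Constant {
        if bits <= 64 {
            Constant { repr: Repr::Native(value & mask(bits)), bits }
        } else {
            let mut limbs = vec![0u64; (bits + 63) / 64];
            limbs[0] = value;
            Constant { repr: Repr::Wide(limbs), bits }
        }
    }

    /// Addition modulo 2^bits. Returns None if the widths differ.
    fn add(&self, rhs: &Constant) -> Option<Constant> {
        if self.bits != rhs.bits {
            return None;
        }
        let repr = match (&self.repr, &rhs.repr) {
            // One wrapping add and one mask, with no allocation.
            (Repr::Native(a), Repr::Native(b)) => {
                Repr::Native(a.wrapping_add(*b) & mask(self.bits))
            }
            (Repr::Wide(a), Repr::Wide(b)) => {
                let mut out = Vec::with_capacity(a.len());
                let mut carry = false;
                for (x, y) in a.iter().zip(b) {
                    let (sum, c1) = x.overflowing_add(*y);
                    let (sum, c2) = sum.overflowing_add(carry as u64);
                    carry = c1 || c2;
                    out.push(sum);
                }
                // Truncate the top limb when bits is not a multiple of 64.
                let top_bits = self.bits % 64;
                if top_bits != 0 {
                    if let Some(last) = out.last_mut() {
                        *last &= mask(top_bits);
                    }
                }
                Repr::Wide(out)
            }
            _ => return None, // equal widths always share a representation
        };
        Some(Constant { repr, bits: self.bits })
    }
}

fn main() {
    // 8-bit add wraps around on the native path.
    let sum = Constant::new(0xff, 8).add(&Constant::new(1, 8));
    assert_eq!(sum, Some(Constant::new(0, 8)));

    // 128-bit add carries into the second limb on the wide path.
    let wide = Constant::new(u64::MAX, 128).add(&Constant::new(1, 128)).unwrap();
    assert_eq!(wide.repr, Repr::Wide(vec![0, 1]));
}
```

The representation is chosen once from the bit width, so operations between equal-width constants never need to convert between the two forms.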

falcon::translator::Arch::translate_function should take Memory trait

falcon::translator::Arch::translate_function should take some sort of trait for Memory. This will allow various backing for memory to be passed to translate_function. Specifically, we can implement this trait for symbolic memory, and have executors concretize and create new functions on the fly.
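A rough sketch of how such a trait might look. CodeMemory, FlatMemory, SymbolicMemory, and fetch_function_bytes are hypothetical names chosen for this example, and the function only gathers bytes where a real translate_function would lift them.

```rust
use std::collections::BTreeMap;

/// Hypothetical trait: the only thing this sketch assumes a translator needs
/// from memory is to read one concrete byte at an address.
trait CodeMemory {
    fn get_u8(&self, address: u64) -> Option<u8>;

    /// Read up to `length` bytes, stopping at the first unavailable byte.
    fn get_bytes(&self, address: u64, length: usize) -> Vec<u8> {
        (0..length as u64)
            .map_while(|offset| self.get_u8(address.checked_add(offset)?))
            .collect()
    }
}

/// Flat, fully concrete backing, such as a section loaded from a file.
struct FlatMemory {
    base: u64,
    bytes: Vec<u8>,
}

impl CodeMemory for FlatMemory {
    fn get_u8(&self, address: u64) -> Option<u8> {
        let offset = address.checked_sub(self.base)?;
        self.bytes.get(offset as usize).copied()
    }
}

/// Toy symbolic backing: a byte is either concrete or a named symbol.
enum SymbolicByte {
    Concrete(u8),
    Symbol(String),
}

struct SymbolicMemory {
    bytes: BTreeMap<u64, SymbolicByte>,
}

impl CodeMemory for SymbolicMemory {
    fn get_u8(&self, address: u64) -> Option<u8> {
        match self.bytes.get(&address)? {
            SymbolicByte::Concrete(byte) => Some(*byte),
            // A real executor would ask its solver to concretize here.
            SymbolicByte::Symbol(_) => None,
        }
    }
}

/// Stand-in for translate_function: generic over the memory backing.
fn fetch_function_bytes<M: CodeMemory>(memory: &M, address: u64, max: usize) -> Vec<u8> {
    memory.get_bytes(address, max)
}

fn main() {
    let flat = FlatMemory { base: 0x1000, bytes: vec![0x90, 0xc3] };
    assert_eq!(fetch_function_bytes(&flat, 0x1000, 16), vec![0x90, 0xc3]);

    let mut bytes = BTreeMap::new();
    bytes.insert(0x2000, SymbolicByte::Concrete(0x90));
    bytes.insert(0x2001, SymbolicByte::Symbol("input_0".to_string()));
    let symbolic = SymbolicMemory { bytes };
    // Reading stops at the first byte that is not concrete.
    assert_eq!(fetch_function_bytes(&symbolic, 0x2000, 16), vec![0x90]);
}
```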

docs.rs/falcon does not build

This is falcon's tracking issue for our docs.rs documentation failing to build.

docs.rs issue is here: rust-lang/docs.rs#1351

If docs.rs doesn't work, we'll host the docs elsewhere. However, my first choice is to make docs.rs work.

0.5 IL Changes

This issue exists as a place to discuss IL changes for Falcon 0.5.

  • Constant - No change
  • Scalar
    • Add Optional SSA parameter (already implemented)
  • Expression - No change
  • Operation
    • Add il::Operation::Conditional(Box<il::Operation>), which allows for a conditionally executed instruction.
  • Instruction - No Change
  • Block - No Change
  • Edge - No Change
  • ControlFlowGraph - No Change
  • Function - No Change
  • Program - No Change

Other Ideas:

  • Delay Slots.
  • Parallel Executed Instructions (Execution Packets).
  • Placeholder

Delay Slots

We can create an operation, il::Operation::Delay(usize, Box<Operation>), but allowing for operations with arbitrary delays makes the implementation of analyses and control-flow recovery more difficult. We would need to create some sort of, "Executor," which could be used by analyses, which kept track of operations in delay slots/pipelines.

Parallel Execution

I am less sure how to incorporate this. Currently, il::Instruction corresponds nicely to a single instruction. In some architectures, however, multiple instructions can be executed simultaneously. We can either lift these addresses to an il::Instruction, and at the il::Instruction level mark whether the instruction is parallel or not, or we can create an il::Operation::Parallel(Vec<il::Operation>). I'm again worried about creating an il::Operation::Parallel(Vec<il::Operation>), because it may make implementing analyses more challenging. We would need to integrate this into an, "Executor," of sorts which managed all of this for us.

Placeholder

Often times we Nop out instructions we aren't concerned with. Specific to this conversation is the NOPing of branch instructions. People would like to retain this information in an optional fashion.

There are a couple ways to do this.

  1. We create a Placeholder Operation, which is another operation people need to consider while doing analyses.
  2. We create a placeholder: Option<il::Operation> field for Nops. This becomes a, "Bonus add-on," modification that should not affect anything as is.
  3. We allow the attachment of serialized data to different components of the IL, beginning with Program, Function, Block, Instruction, and Operation. We can go with either JSON or, as I am going to recommend, bincode. JSON is nice because it allows the entire IL to still be easily serializable. Bincode is nice because anything that is Rust can be encoded with bincode (JSON only allows things which can be converted to strings as map indices).
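To make option 2 concrete, here is a small sketch on a cut-down enum. This Operation type, its variants, and the helper methods are hypothetical and much simpler than il::Operation. The point it illustrates is that analyses which match on Nop { .. } keep working, while an analysis that cares can look inside the placeholder.

```rust
/// Hypothetical, cut-down stand-in for il::Operation.
#[derive(Clone, Debug, PartialEq)]
enum Operation {
    Assign { dst: String, src: String },
    Branch { target: String },
    /// A nop that may remember the operation it replaced.
    Nop { placeholder: Option<Box<Operation>> },
}

impl Operation {
    /// Replace an operation with a nop, keeping the original as a placeholder.
    fn nop_out(self) -> Operation {
        Operation::Nop { placeholder: Some(Box::new(self)) }
    }

    /// An analysis that ignores placeholders: it matches `Nop { .. }` and is
    /// unaffected by the extra field.
    fn written_scalar(&self) -> Option<&str> {
        match self {
            Operation::Assign { dst, .. } => Some(dst.as_str()),
            Operation::Branch { .. } | Operation::Nop { .. } => None,
        }
    }

    /// An analysis that opts in to the placeholder.
    fn original_branch_target(&self) -> Option<&str> {
        match self {
            Operation::Nop { placeholder: Some(inner) } => match inner.as_ref() {
                Operation::Branch { target } => Some(target.as_str()),
                _ => None,
            },
            _ => None,
        }
    }
}

fn main() {
    let assign = Operation::Assign { dst: "eax".to_string(), src: "ebx".to_string() };
    assert_eq!(assign.written_scalar(), Some("eax"));

    let nop = Operation::Branch { target: "0x401000".to_string() }.nop_out();
    assert_eq!(nop.written_scalar(), None);
    assert_eq!(nop.original_branch_target(), Some("0x401000"));
}
```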

Use serde for Constants

Allow deserializing and serializing results of constants analysis with Serde. This will allow analysis to be conducted once and then saved.

[BUG] segfault in capstone disasm

Program received signal SIGSEGV, Segmentation fault.
0x000055555589f221 in falcon_capstone::capstone::Instr::new () at src/capstone.rs:192
192	            let detail = unsafe { *instr.detail };
(gdb) bt
#0  0x000055555589f221 in falcon_capstone::capstone::Instr::new ()
    at src/capstone.rs:192
#1  falcon_capstone::capstone::InstrBuf::get () at src/capstone.rs:395
#2  0x0000555555858ea7 in falcon::translator::x86::translator::translate_block
    () at lib/translator/x86/translator.rs:90
#3  0x0000555555831f7f in falcon::translator::x86::{impl#3}::translate_block ()
    at lib/translator/x86/mod.rs:54
#4  falcon::translator::Translator::translate_function_extended<falcon::translator::x86::Amd64> () at lib/translator/mod.rs:163
#5  0x0000555555784c49 in falcon::loader::Loader::program_verbose<falcon::loader::elf::elf_linker::ElfLinker> ()
    at /home/godtex/.cargo/registry/src/index.crates.io-6f17d22bba15001f/falcon-0.5.5/lib/loader/mod.rs:150
#6  falcon::loader::Loader::program_recursive_verbose<falcon::loader::elf::elf_linker::ElfLinker> ()
    at /home/godtex/.cargo/registry/src/index.crates.io-6f17d22bba15001f/falcon-0.5.5/lib/loader/mod.rs:198
#7  falcon::loader::Loader::program_recursive<falcon::loader::elf::elf_linker::ElfLinker> ()
    at /home/godtex/.cargo/registry/src/index.crates.io-6f17d22bba15001f/falcon-0.5.5/lib/loader/mod.rs:169

towards a rust reversers datastructures crate

Hello!

So I've been reading the project. Really great work, and I'm really excited for some of the stuff you're doing. Can't wait to see more, no matter what gets decided!

On that note, as you guessed from the title, I'm hoping it might be possible to consolidate 0-N things from this, panopticon, and a theoretical new memory interval crate that I want to write, as well as some other things.

This is a huge, huge topic, and I likely won't hit on a lot of the points, but just getting the ball rolling is good I think, if only to see if you're interested, where you're headed with things, etc.

If you're not interested at all, that is totally fine of course :) Just wanted to see what you think

Generic IL/Function crate

So, for starters (and probably most controversially), reading through your source, particularly the il module, there is so much that I think could be refactored (along with panopticon) into a generic function/il rust crate.

I say controversial because it will likely be hard/tedious, but i do think it would be (extremely) beneficial.

It would also require probably the deepest amount of coordination, which could be hard.

Nevertheless, I think some prime candidates are the il, and the function objects. If we could somehow make Function<IL>, where IL is the intermediate language used, this could have really really cool benefits.

  1. It would allow all of us to try out each other's IR
  2. it would allow us to switch to another IR for a different task (perhaps one is more compact, hence faster analysis)
  3. It would allow us to reuse the function logic and definitions and methods, consolidating bugs and developer effort into a single location (this and 2 are my prime motivation)

It's hard for me to state how great this could be if we were able to swap out ILs at will. It also just seems right from an engineering perspective, similar to backends on a compiler.
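As a very rough illustration of the Function<IL> idea, here is a sketch with two toy ILs. The Il trait, ByteIl, and WordIl are invented for this example and say nothing about how Falcon or panopticon structure their types. It also sidesteps the disassembler question entirely by lifting straight from bytes.

```rust
use std::fmt::Debug;

/// Hypothetical trait every IL would implement.
trait Il {
    type Instruction: Debug + PartialEq;

    /// Lift raw bytes into this IL. A real lifter would sit on a disassembler.
    fn lift(bytes: &[u8]) -> Vec<Self::Instruction>;
}

/// Function logic written once, for any IL.
struct Function<I: Il> {
    address: u64,
    instructions: Vec<I::Instruction>,
}

impl<I: Il> Function<I> {
    fn lift(address: u64, bytes: &[u8]) -> Function<I> {
        Function { address, instructions: I::lift(bytes) }
    }

    fn len(&self) -> usize {
        self.instructions.len()
    }
}

/// Toy IL 1: one instruction per byte.
struct ByteIl;

impl Il for ByteIl {
    type Instruction = u8;

    fn lift(bytes: &[u8]) -> Vec<u8> {
        bytes.to_vec()
    }
}

/// Toy IL 2: more compact, one big-endian instruction per two bytes.
struct WordIl;

impl Il for WordIl {
    type Instruction = u16;

    fn lift(bytes: &[u8]) -> Vec<u16> {
        bytes
            .chunks(2)
            .map(|chunk| chunk.iter().fold(0u16, |acc, byte| (acc << 8) | u16::from(*byte)))
            .collect()
    }
}

fn main() {
    let bytes = [0x01, 0x02, 0x03, 0x04];

    // The same Function code, instantiated with two different ILs.
    let detailed = Function::<ByteIl>::lift(0x1000, &bytes);
    let compact = Function::<WordIl>::lift(0x1000, &bytes);

    assert_eq!(detailed.address, compact.address);
    assert_eq!(detailed.len(), 4);
    assert_eq!(compact.instructions, vec![0x0102, 0x0304]);
}
```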

As it stands now in both codebases, I think this modification is almost trivially possible - except - the disassembler aspect.

But this isn't necessarily bad news!

For almost the exact reasons in 1-3, i think it would be really cool to allow function (or whatever it ends up being) to also be generic in the disassembler, allowing a more robust disassembler implementation (like capstone), or a home grown solution like panopticon, etc.

Again the benefits here are experimentation: we can try a different disassembler for a different IR backend, etc.

Doing this I think will require sketching out what a generic function + a generic disassembler would look like, and what would be the most flexible, and hence requires the most cooperation and assessment of current codebases dependencies and expectations etc., but long term I think it would be really cool, it would allow all our work to be pooled together and hence we'd all benefit.

While I think this will be the hardest part to refactor, coordinate and get right, I think it will actually have the most benefit; of course, this is just my opinion though :)

An interval tree crate, with a second crate geared towards binary memory intervals

I don't think this is controversial at all, and I think it would be invaluable. I want something like this already for bingrep, panopticon needs it, and i'm sure falcon could use it too.

Basically the idea is a:

[x..y) -> Value

Which is a datastructure that's created after the parser pass (or whenever you want, as long as you can send it a goblin binary), and which initially gets filled up with segment/section data: which ranges, what the name of the segment is, and perhaps what "kind of data" is there. We'd figure out what we want for a segment datatype, what information we'd need, etc. And of course, if it's a central crate, when we need something new, we just extend it and everyone gets the benefit.

Similarly, and this would be the tricky part and where I want feedback, downstream users could also extend the memory ranges with their own tagging data, like [0xbeef..0xdead) -> FunctionRange, etc.

Even if some fancy runtime extendable type doesn't work out, even if we just agree on an enum in this crate which downstream clients use, I think this would be great code reuse and benefit everyone all around.
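A minimal sketch of the [x..y) -> Value idea, built on std's BTreeMap rather than a real interval tree. IntervalMap and Tag are hypothetical names, and this version simply refuses overlapping ranges; a shared crate would have to decide how to treat overlaps.

```rust
use std::collections::BTreeMap;

/// Map from non-overlapping half-open ranges [start, end) to values.
struct IntervalMap<V> {
    /// start -> (end, value)
    ranges: BTreeMap<u64, (u64, V)>,
}

impl<V> IntervalMap<V> {
    fn new() -> IntervalMap<V> {
        IntervalMap { ranges: BTreeMap::new() }
    }

    /// Insert [start, end). Returns false for empty or overlapping ranges.
    fn insert(&mut self, start: u64, end: u64, value: V) -> bool {
        if start >= end {
            return false;
        }
        // Of all ranges starting before `end`, only the last can reach past
        // `start`, because stored ranges never overlap each other.
        if let Some((_, (previous_end, _))) = self.ranges.range(..end).next_back() {
            if *previous_end > start {
                return false;
            }
        }
        self.ranges.insert(start, (end, value));
        true
    }

    /// Look up the value whose range contains `address`.
    fn get(&self, address: u64) -> Option<&V> {
        let (_, (end, value)) = self.ranges.range(..=address).next_back()?;
        if address < *end { Some(value) } else { None }
    }
}

/// Example tagging enum that downstream users might agree on.
#[derive(Debug, PartialEq)]
enum Tag {
    Segment(String),
    FunctionRange,
}

fn main() {
    let mut map = IntervalMap::new();
    assert!(map.insert(0x1000, 0x2000, Tag::Segment(".text".to_string())));
    assert!(map.insert(0xbeef, 0xdead, Tag::FunctionRange));

    // Overlaps an existing range, so it is rejected.
    assert!(!map.insert(0x1800, 0x2800, Tag::FunctionRange));

    assert_eq!(map.get(0x1234), Some(&Tag::Segment(".text".to_string())));
    assert_eq!(map.get(0xbeef), Some(&Tag::FunctionRange));
    // The end of a half-open range is not part of it.
    assert_eq!(map.get(0xdead), None);
}
```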

dynamic linker/runtime loader

Your loader looks really awesome!!! So i've been trying to get other persons to help create a relocator crate for a while, but no one is really interested in this stuff 😆

Anyway, at some future date I think panopticon wants to have this. So I've wanted to turn https://github.com/m4b/dryad into a library for quite some time. Basically I like working on that project and I'll find any excuse; also all that code going to waste would be sad.

So I'd like to propose potentially fusing falcon's runtime loader here with dryad, or vice versa. Perhaps dryad becomes a lib, or I rip out parts of it via copy-paste, whatever. That crate is then refactored into a library which downstream consumers like falcon and panopticon (and whoever really, who knows the applications!) can use as their runtime linking and loading system.

The initial issue is that I'll have to put the asm usage and bare functions in dryad behind feature flags, as they require nightly, and it's not nice to force that on downstream clients (which would be sad, since it's a pure Rust toolchain dynamic linker that way!)

More things I haven't even thought of

Anyway, that's my suggestion for 3 different things I think are candidate usecases to refactor out into shared dependencies for great good. I'm sure there are many other opportunities as well.

Let me know what you're thinking; as you can tell, I'm of the persuasion we should combine all of our powers and take over the universe 👼

Thanks for reading this far, I know, it was a lot :)

/cc @flanfly
