
microx's Introduction

microx - a micro execution framework


Microx is a single-instruction "micro execution" framework. Microx enables a program to safely execute an arbitrary x86 or x86-64 instruction. Microx does not take over or require a process context in order to execute an instruction. It is easily embedded within other programs, as the Python bindings demonstrate.

The microx approach to safely executing arbitrary instructions is to require the user of microx to manage machine state. Microx is packaged as a C++ Executor class that must be extended. The Python bindings also present a class, microx.Executor, that must be extended. A program extending this class must implement methods such as read_register and read_memory. When supplied with instruction bytes, microx invokes these methods to pull in the minimal machine state required to execute the instruction. After executing the instruction, microx "reports back" the state changes induced by the instruction's execution, again via methods such as write_register and write_memory.
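The callback-driven model described above can be illustrated with a small, self-contained sketch. It does not import or reproduce the microx API: ToyExecutor, DictBackedExecutor, execute_toy_load and all method signatures are hypothetical, and the "instruction" is a made-up 8-byte load rather than decoded x86. It only shows the shape of the idea, in which the engine asks the subclass for the state it needs and reports results back the same way.

```python
from abc import ABC, abstractmethod


class ToyExecutor(ABC):
    """Hypothetical stand-in for an executor base class (NOT the microx API)."""

    @abstractmethod
    def read_register(self, name: str) -> int: ...

    @abstractmethod
    def write_register(self, name: str, value: int) -> None: ...

    @abstractmethod
    def read_memory(self, address: int, size: int) -> bytes: ...

    @abstractmethod
    def write_memory(self, address: int, data: bytes) -> None: ...

    def execute_toy_load(self, dest: str, base: str) -> None:
        """Toy operation: load 8 little-endian bytes at [base] into dest.

        Only the state this operation needs is requested through the
        callbacks, and the result is reported back through a callback.
        """
        address = self.read_register(base)
        raw = self.read_memory(address, 8)
        self.write_register(dest, int.from_bytes(raw, "little"))


class DictBackedExecutor(ToyExecutor):
    """User-side subclass that owns all machine state in plain dicts."""

    def __init__(self, registers, memory):
        self.registers = dict(registers)
        self.memory = dict(memory)  # address -> byte value
        self.log = []  # records every callback the engine makes

    def read_register(self, name):
        self.log.append(("read_register", name))
        return self.registers[name]

    def write_register(self, name, value):
        self.log.append(("write_register", name, value))
        self.registers[name] = value

    def read_memory(self, address, size):
        self.log.append(("read_memory", address, size))
        return bytes(self.memory.get(address + i, 0) for i in range(size))

    def write_memory(self, address, data):
        self.log.append(("write_memory", address, bytes(data)))
        for offset, byte in enumerate(data):
            self.memory[address + offset] = byte


executor = DictBackedExecutor({"rax": 0, "rbx": 0x1000}, {0x1000: 0x2A})
executor.execute_toy_load("rax", "rbx")
```

After the call, executor.log holds exactly the three callbacks the toy operation needed, which is the "minimal state" property in miniature.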

The following lists some use-cases of microx:

  • Speculative execution of code within a debugger-like system. In this scenario, microx can be used to execute instructions from the process being debugged, in such a way that the memory and state of the original program will be preserved.
  • Binary symbolic execution. In this scenario, which was the original use-case of microx, a binary symbolic executor can use microx to safely execute an instruction that is not supported or modelled by the symbolic execution system. The use of microx will minimize the amount of symbolic state that may need to be concretized in order to execute the instruction. Microx was used in this fashion in a Python-based binary symbolic executor. Microx comes with Python bindings for this reason.
  • Headless taint tracking. Taint tracking can be implemented with microx, much as it would be with Intel's PIN, but without a process context. Microx can be integrated into a disassembler such as IDA or Binary Ninja and used to execute instructions, performing taint tracking along the way.
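For the speculative-execution use-case, one way a caller might keep the original program's memory intact is to serve reads from a snapshot and redirect writes to a shadow copy. The sketch below is an assumption about how user code could do this; OverlayMemory and its methods are hypothetical names and are not part of microx.

```python
class OverlayMemory:
    """Copy-on-write view over an immutable snapshot (hypothetical helper)."""

    def __init__(self, snapshot: bytes, base: int):
        self._snapshot = snapshot  # original bytes, never modified
        self._base = base          # address of snapshot[0]
        self._shadow = {}          # address -> speculatively written byte

    def read(self, address: int, size: int) -> bytes:
        out = bytearray()
        for current in range(address, address + size):
            if current in self._shadow:
                # A speculative write takes priority over the snapshot.
                out.append(self._shadow[current])
                continue
            offset = current - self._base
            if not 0 <= offset < len(self._snapshot):
                raise IndexError(f"unmapped address {current:#x}")
            out.append(self._snapshot[offset])
        return bytes(out)

    def write(self, address: int, data: bytes) -> None:
        # Writes never touch the snapshot; they only land in the shadow.
        for offset, byte in enumerate(data):
            self._shadow[address + offset] = byte

    def discard(self) -> None:
        """Throw away all speculative writes."""
        self._shadow.clear()


original = bytes([1, 2, 3, 4])
memory = OverlayMemory(original, base=0x4000)
memory.write(0x4001, b"\xff")
speculative_view = memory.read(0x4000, 4)
memory.discard()
restored_view = memory.read(0x4000, 4)
```

An executor subclass could forward its memory callbacks to such an object, then call discard() to roll back.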

Microx uses a combination of JIT-based dynamic binary translation and instruction emulation in order to safely execute x86 instructions. It is a 64-bit library, but it can execute 32-bit instructions that are not supported on 64-bit platforms. It can be easily embedded, as it performs no dynamic memory allocations, and is re-entrant.

Microx depends on Intel's XED instruction encoder and decoder.

Installing

Microx has Python bindings; you can install them via pip on macOS and Linux:

$ python -m pip install microx

Building (Python)

If we don't supply a Python wheel for your platform, you can build microx yourself. You'll need at least Python 3.7.

First, build XED:

$ ./scripts/bootstrap.sh

Then, use setup.py build:

$ python setup.py build

Building (C++)

Microx's C++ library can be built with CMake.

The CMake build uses XED_DIR to locate the XED library and headers.

To use the third_party XED build:

$ ./scripts/bootstrap.sh
$ export XED_DIR=$(pwd)/third_party

Then, run a normal CMake build:

$ mkdir build && cd build
$ cmake ..
$ cmake --build .

microx's People

Contributors

artemdinaburg, awesie, dependabot[bot], dguido, ekilmer, mrexodia, pasztorpisti, pgoodman, woodruffw, xlauko


microx's Issues

REP string instructions are broken

REP MOVSB was not working correctly for me. It seems the source and destination registers are not in the modified register set and do not get updated by WriteRegisters.

Something like this fixes the bug for me:

@@ -802,6 +802,7 @@ static void UpdateFlagsSub(Flags &flags, uintptr_t lhs, uintptr_t rhs,
     WriteGPR(dest_reg,                                                       \
              static_cast<uint64_t>(static_cast<int64_t>(ReadGPR(dest_reg)) + \
                                    stringop_inc));                           \
+    gModifiedRegs.set(dest_reg);                                             \
   } while (false)

 #define SCAS                                                                 \
@@ -813,6 +814,7 @@ static void UpdateFlagsSub(Flags &flags, uintptr_t lhs, uintptr_t rhs,
     WriteGPR(dest_reg,                                                       \
              static_cast<uint64_t>(static_cast<int64_t>(ReadGPR(dest_reg)) + \
                                    stringop_inc));                           \
+    gModifiedRegs.set(dest_reg);                                             \
   } while (false)

 #define LODS                                                                 \
@@ -821,6 +823,7 @@ static void UpdateFlagsSub(Flags &flags, uintptr_t lhs, uintptr_t rhs,
     WriteGPR(dest_reg,                                                       \
              static_cast<uint64_t>(static_cast<int64_t>(ReadGPR(dest_reg)) + \
                                    stringop_inc));                           \
+    gModifiedRegs.set(dest_reg);                                             \
   } while (false)

 #define MOVS                                                                 \
@@ -832,6 +835,8 @@ static void UpdateFlagsSub(Flags &flags, uintptr_t lhs, uintptr_t rhs,
     WriteGPR(dest_reg,                                                       \
              static_cast<uint64_t>(static_cast<int64_t>(ReadGPR(dest_reg)) + \
                                    stringop_inc));                           \
+    gModifiedRegs.set(src_reg);                                              \
+    gModifiedRegs.set(dest_reg);                                             \
   } while (false)

 #define CMPS                                                                 \
@@ -845,6 +850,8 @@ static void UpdateFlagsSub(Flags &flags, uintptr_t lhs, uintptr_t rhs,
     WriteGPR(dest_reg,                                                       \
              static_cast<uint64_t>(static_cast<int64_t>(ReadGPR(dest_reg)) + \
                                    stringop_inc));                           \
+    gModifiedRegs.set(src_reg);                                              \
+    gModifiedRegs.set(dest_reg);                                             \
   } while (false)

 #define REPNE(...)                \
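The effect of the missing bookkeeping can be shown with a small Python model. This is not microx code: RegisterFile, rep_movsb and write_back are hypothetical names, and the model marks a register as modified inside the write helper itself, which is a different design from the patch above (the patch adds explicit gModifiedRegs.set calls next to each WriteGPR).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


class RegisterFile:
    """Toy register file that remembers which registers were written."""

    def __init__(self, values):
        self.values = dict(values)
        self.modified = set()

    def write(self, name, value):
        self.values[name] = value & MASK64
        # Marking here means a write can never be forgotten at write-back.
        self.modified.add(name)


def movsb_step(regs, memory, increment):
    """Copy one byte from [rsi] to [rdi], then advance both pointers."""
    memory[regs.values["rdi"]] = memory.get(regs.values["rsi"], 0)
    regs.write("rsi", regs.values["rsi"] + increment)
    regs.write("rdi", regs.values["rdi"] + increment)


def rep_movsb(regs, memory, increment=1):
    """Repeat movsb_step until rcx reaches zero."""
    while regs.values["rcx"] != 0:
        movsb_step(regs, memory, increment)
        regs.write("rcx", regs.values["rcx"] - 1)


def write_back(regs):
    """Return only the registers that must be reported to the caller."""
    return {name: regs.values[name] for name in sorted(regs.modified)}


regs = RegisterFile({"rsi": 0x10, "rdi": 0x20, "rcx": 3, "rax": 7})
memory = {0x10: 0x41, 0x11: 0x42, 0x12: 0x43}
rep_movsb(regs, memory)
reported = write_back(regs)
```

In this model rsi and rdi appear in the reported set alongside rcx; if they were left out of the modified set, the caller would see the copied bytes but stale pointers.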

Python example with 64-bit RIP

Setting the memory map address bit size to 64 is not sufficient to run x86-64 code. Execution loops forever over a single instruction because, when microx updates the program counter, it writes EIP instead of RIP.
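A minimal sketch of the behaviour one would expect instead, assuming the program counter register is chosen from the address size. The function names and the register dictionary are hypothetical and do not reflect how microx or its Python example is structured.

```python
def program_counter_register(address_size_bits: int):
    """Return (register name, value mask) for the given address size."""
    if address_size_bits == 64:
        return "RIP", 0xFFFFFFFFFFFFFFFF
    if address_size_bits == 32:
        return "EIP", 0xFFFFFFFF
    raise ValueError(f"unsupported address size: {address_size_bits}")


def advance_pc(registers: dict, address_size_bits: int, length: int) -> str:
    """Advance the correct program counter by the instruction length."""
    name, mask = program_counter_register(address_size_bits)
    registers[name] = (registers[name] + length) & mask
    return name


registers = {"RIP": 0x7FFF00001000}
updated = advance_pc(registers, 64, 3)
```

If the update went to EIP here, RIP would keep its old value and the same instruction would be fetched again, which matches the looping described above.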

Bug in pop rsp

pop rsp currently returns a result that is 8 greater than the expected value. One possible explanation is that the following happens:

new = *rsp
store(dst, new)
*rsp = *rsp + 8

instead of

new = *rsp
*rsp = *rsp + 8
store(dst, new)
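The two orderings can be compared with a short Python model. It is only an illustration of the hypothesis above, not microx code; pop_into and the dictionary-based stack are hypothetical.

```python
def pop_into(registers: dict, stack: dict, dest: str, increment_first: bool):
    """Model a 64-bit pop into dest using one of the two orderings."""
    value = stack[registers["rsp"]]  # new = value at the top of the stack
    if increment_first:
        registers["rsp"] += 8        # adjust the stack pointer first...
        registers[dest] = value      # ...then store, so dest=rsp keeps value
    else:
        registers[dest] = value      # store first...
        registers["rsp"] += 8        # ...then the increment clobbers dest=rsp


stack = {0x7000: 0x9000}

expected = {"rsp": 0x7000}
pop_into(expected, stack, "rsp", increment_first=True)

buggy = {"rsp": 0x7000}
pop_into(buggy, stack, "rsp", increment_first=False)
```

With the increment applied first, rsp ends up holding the popped value; with the store applied first, it ends up 8 higher, which is the symptom reported.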

CI builds

We should build both microx and its Python bindings in the CI.

DF not handled properly

DF does not hold the correct value after some instructions execute. This is most visible with popf, and it can crash the program in other cases (possibly because the flag is not restored properly?).
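For context, DF is bit 10 of the flags register and selects whether string instructions step their pointers forwards or backwards, so a stale DF changes where those instructions read and write. The helper below is a hypothetical illustration of that dependency, not code taken from microx.

```python
DF_MASK = 1 << 10  # direction flag is bit 10 of EFLAGS/RFLAGS


def string_op_increment(flags: int, operand_size_bytes: int) -> int:
    """Return the signed pointer step used by string instructions."""
    if flags & DF_MASK:
        return -operand_size_bytes  # DF set: pointers move downwards
    return operand_size_bytes       # DF clear: pointers move upwards


forward = string_op_increment(0x202, 1)   # DF clear, byte operation
backward = string_op_increment(0x602, 4)  # DF set, dword operation
```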

Publish to PyPI

We should publish microx to PyPI. Building it as a redistributable wheel shouldn't be too difficult, as long as we link XED statically.

Missing support for RDTSC / RDTSCP

Although these are categorized as "system" instructions, they are also popular within userspace applications. It is unlikely that microx can provide a "correct" implementation, but it can at least offer a hook through which the user supplies one.

Something like:

@@ -1347,6 +1421,18 @@ static bool Emulate(const Executor *executor, uintptr_t &next_pc,
       WriteGPR(reg0, mem0);
       return true;

+    case XED_IFORM_RDTSC:
+    case XED_IFORM_RDTSCP: {
+        Data tsc;
+        if (!executor->ReadTsc(tsc)) {
+          status = ExecutorStatus::kErrorReadTsc;
+          return false;
+        }
+        WriteGPR(reg0, *reinterpret_cast<uint32_t *>(&tsc.bytes[0]));
+        WriteGPR(reg1, *reinterpret_cast<uint32_t *>(&tsc.bytes[4]));
+      }
+      return true;
+
     default:
       return false;
   }
@@ -1382,6 +1468,14 @@ static bool UsesUnsupportedAttributes(void) {

 static bool UsesUnsupportedFeatures(const Executor *executor) {
   switch (xed_decoded_inst_get_category(gXedd)) {
+    case XED_CATEGORY_SYSTEM:
+      switch (xed_decoded_inst_get_iform_enum(gXedd)) {
+        case XED_IFORM_RDTSC:
+        case XED_IFORM_RDTSCP:
+          return false;
+        default:
+          return true;
+      }
     case XED_CATEGORY_3DNOW:
     case XED_CATEGORY_MPX:
     case XED_CATEGORY_AES:
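The register split performed by the proposed patch can be sketched in Python. RDTSC returns the low 32 bits of the counter in EAX and the high 32 bits in EDX, clearing the upper halves of RAX and RDX in 64-bit mode. The names split_tsc and emulate_rdtsc, and the read_tsc callback, are hypothetical; they mirror the idea of the ReadTsc hook in the diff rather than any existing microx interface. RDTSCP additionally loads ECX with IA32_TSC_AUX, which this sketch omits.

```python
def split_tsc(tsc: int):
    """Split a 64-bit timestamp into (low 32 bits, high 32 bits)."""
    tsc &= 0xFFFFFFFFFFFFFFFF
    return tsc & 0xFFFFFFFF, tsc >> 32


def emulate_rdtsc(registers: dict, read_tsc) -> None:
    """Store a user-supplied timestamp the way RDTSC reports it."""
    low, high = split_tsc(read_tsc())
    registers["rax"] = low   # EAX = low half, upper 32 bits of RAX cleared
    registers["rdx"] = high  # EDX = high half, upper 32 bits of RDX cleared


registers = {"rax": 0xFFFFFFFFFFFFFFFF, "rdx": 0xFFFFFFFFFFFFFFFF}
emulate_rdtsc(registers, lambda: 0x1122334455667788)
```

Letting the caller supply read_tsc keeps the emulation deterministic, which matters for use-cases such as symbolic execution where replayability is useful.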

Saving/restoring XMM / YMM register state in inline assembly

I don't recall whether we're actually saving and restoring enough. I think we're only saving xmm0 through xmm7. I'd need to think more about whether more is needed. I don't believe that we rewrite things like raw uses of xmm8 (in AVX2) to be in that range, for instance.
