Git Product home page Git Product logo

python-idb's Introduction

python-idb

python-idb is a library for accessing the contents of IDA Pro databases (.idb files). It provides read-only access to internal structures such as the B-tree (ID0 section), name address index (NAM section), and flags index (ID2 section). The library also provides analysis of B-tree entries to expose logical structures like functions, cross references, bytes, and disassembly (via Capstone). An example use for python-idb might be to run IDA scripts in a pure-Python environment.

Willem Hengeveld (mailto:[email protected]) provided the initial research into the low-level structures in his projects pyidbutil and idbutil. Willem deserves substantial credit for reversing the .idb file format and publishing his results online. This project heavily borrows from his knowledge, though there is little code overlap.

example use:

example: list function names

In this example, we list the effective addresses and names of functions:

In [4]: import idb
   ...: with idb.from_file('./data/kernel32/kernel32.idb') as db:
   ...:     api = idb.IDAPython(db)
   ...:     for ea in api.idautils.Functions():
   ...:         print('%x: %s' % (ea, api.idc.GetFunctionName(ea)))

Out [4]: 68901010: GetStartupInfoA
   ....: 689011df: Sleep
   ....: 68901200: MulDiv
   ....: 68901320: SwitchToFiber
   ....: 6890142c: GetTickCount
   ....: 6890143a: ReleaseMutex
   ....: 68901445: WaitForSingleObject
   ....: 68901450: GetCurrentThreadId
        ...

Note that we create an emulated instance of the IDAPython scripting interface, and use this to invoke idc and idautils routines to fetch data.

example: run an existing IDAPython script

In this example, we run the yara_fn.py IDAPython script to generate a YARA rule for the function at effective address 0x68901695 in kernel32.idb:

asciicast

The target script yara_fn.py has only been slightly modified:

  • to make it Python 3.x compatible, and
  • to use the modern IDAPython modules, such as ida_bytes.GetManyBytes rather than idc.GetManyBytes.

what works

  • 50 unit tests that demonstrate functionality including file format, B-tree, analysis, and idaapi features.
  • read-only parsing of .idb files from IDA Pro v6.95
    • extraction of file sections
    • B-tree lookups and queries (ID0 section)
    • flag enumeration (ID1 section)
    • named address listing (NAM section)
  • analysis of artifacts that reconstructs logical elements, including:
    • root metadata
    • loader metadata
    • entry points
    • functions
    • structures
    • cross references
    • fixups
    • segments
  • partial implementation of the IDAPython API, including:
    • Names
    • Heads
    • Segs
    • GetMnem (via Capstone)
    • Functions
    • FlowChart (basic blocks)
    • lots and lots of flags
  • Python 3.x compatibility

what doesn't quite work

support for the following features are feasible and planned, but not yet implemented:

  • compressed databases
  • .i64 files
  • Python 2.7 compatibility
  • performance (there's no caching yet for code clarity)
  • databases from versions other than v6.95
  • parsing TIL section

what will never work

  • write access

getting started

python-idb is a pure-Python (currently: 3.x) library, with the exception of Capstone (required only when calling disassembly APIs). You can install it via pip or setup.py install, both of which should handle depedency resolution:

 $ cd ~/Downloads/python-idb/
 $ python setup.py install
 $ python scripts/run_ida_script.py  ~/tools/yara_fn.py  ~/Downloads/kernel32.idb
   ... profit! ...

While most python-idb function have meaningful docstrings, there is not yet a comprehensive documentation website. However, the unit tests demonstrate functionality that you'll probably find useful.

Someone interested in learning the file format and contributing to the project should review the idb.fileformat module & tests. Those that are looking to extract meaningful information from existing .idb files probably should look at the idb.analysis and idb.idapython modules & tests.

Please report issues or feature requests through Github's bug tracker associated with the project.

license

python-idb is licensed under the Apache License, Version 2.0. This means it is freely available for use and modification in a personal and professional capacity.

python-idb's People

Contributors

williballenthin avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.