fidl's People

Contributors

carlosgprado, itayc0hen, marc-etienne, stevemk14ebr

fidl's Issues

Simplify the API and add a helper function to get all xrefs-to a function

As far as I understand, currently, if one wants to get all the calls to a certain function, they have two options:

  1. display_all_calls_to (function)
    https://github.com/fireeye/FIDL/blob/6b127946b704d4e5f027c48cdb02cbdbef4d8890/FIDL/decompiler_utils.py#L1302-L1307

  2. find_all_calls_to (f_name, ea)
    https://github.com/fireeye/FIDL/blob/6b127946b704d4e5f027c48cdb02cbdbef4d8890/FIDL/decompiler_utils.py#L2250-L2255

display_all_calls_to prints every call to a function globally (across the entire binary), whereas find_all_calls_to returns only the calls made from within one specific function (the function that ea belongs to).

The problem is that the name find_all_calls_to is confusing, because the search is actually restricted to a single function, and there is no way to get a list of callObj for all the calls to a function globally.

I suggest having three functions:

  1. display_all_calls_to (function) - keep as is
  2. find_all_calls_to_from_function (f_name, ea) will behave as the current find_all_calls_to
  3. find_all_calls_to (function) - will return a list of callObj for all the xrefs, globally
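
A rough sketch of how the proposed global variant could be layered on top of the existing function-local search. Everything here is hypothetical: the name find_all_calls_to_global and the three injected callables are placeholders for illustration, not part of FIDL. Inside IDA, xref_sources would presumably wrap idautils.XrefsTo, func_start_of would wrap a function lookup, and calls_in_function would be the current find_all_calls_to.

```python
def find_all_calls_to_global(callee_ea, xref_sources, func_start_of, calls_in_function):
    """Collect call objects for every call to callee_ea, across all callers.

    All three callables are hypothetical stand-ins (assumptions):
      xref_sources(callee_ea)            -> iterable of addresses referencing the callee
      func_start_of(ea)                  -> start of the function containing ea, or None
      calls_in_function(func_ea, callee) -> list of call objects found in that function
    """
    seen = set()
    callers = []
    for src in xref_sources(callee_ea):
        start = func_start_of(src)
        # Skip references that do not belong to a defined function
        if start is None or start in seen:
            continue
        seen.add(start)
        callers.append(start)

    results = []
    for func_ea in callers:
        # Reuse the function-local search once per distinct caller
        results.extend(calls_in_function(func_ea, callee_ea))
    return results
```

Grouping the xrefs by containing function first means each caller is decompiled at most once, even when it calls the target several times.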

Is there not a way to find the ITP anchor for a given line?

Referring to this code, used to add a comment:

https://github.com/fireeye/FIDL/blob/e6ceb000cda43b450717eb171309c02dee06dd4f/FIDL/decompiler_utils.py#L2126-L2135

When I saw this, I thought to myself, surely there is a better way!

According to the IDA CPP header,

 /// Invisible COLOR_ADDR tags in the output text are used to refer to ctree items and variables
 struct ctree_anchor_t
 {
    uval_t value;
    #define ANCHOR_INDEX 0x1FFFFFFF
    #define ANCHOR_MASK 0xC0000000
    #define ANCHOR_CITEM 0x00000000 ///< c-tree item
    #define ANCHOR_LVAR 0x40000000 ///< declaration of local variable
    #define ANCHOR_ITP 0x80000000 ///< item type preciser
    #define ANCHOR_BLKCMT 0x20000000 ///< block comment (for ctree items)
    ...
    item_preciser_t get_itp(void)
    bool is_valid_anchor(void)
    bool is_citem_anchor(void)
    bool is_itp_anchor(void)
    ...
 };

… these other types of anchors are embedded in the string, and the citem_t anchor just happens to be all 0's. I do (think I) see them in a few places, such as this local variable anchor here:

  �(0000000040000007��void *v7���	;�	 // ��[xsp+48h] [xbp-8h]��

But I don't see them at all on some other lines where I would at least expect to see an ANCHOR_ITP for an ITP_SEMI item preciser, like this:

�(0000000000000031  �(0000000000000033��objc_release���(0000000000000032�	(�	�(0000000000000034��v1���	)�	�	;�	�(0000000000000031           

which corresponds to this line:

  objc_release(v1);

So, what gives? Why these anchors only on some lines?
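
For reference, the anchor values quoted above can be decoded with plain bit masks. This is only a sketch built from the constants in the header excerpt; the helper names are made up, and the regular expression assumes the tag layout seen in the dumps (a \x01 byte, then "(", then 16 hex digits), which may not hold for every IDA build.

```python
import re

# Constants copied from the ctree_anchor_t excerpt above
ANCHOR_INDEX = 0x1FFFFFFF
ANCHOR_MASK = 0xC0000000
ANCHOR_CITEM = 0x00000000
ANCHOR_LVAR = 0x40000000
ANCHOR_ITP = 0x80000000
ANCHOR_BLKCMT = 0x20000000

_KINDS = {ANCHOR_CITEM: "citem", ANCHOR_LVAR: "lvar", ANCHOR_ITP: "itp"}

# Assumption: an address tag is \x01 + "(" followed by 16 hex digits
_ANCHOR_RE = re.compile(r"\x01\(([0-9A-Fa-f]{16})")


def decode_anchor(value):
    """Split a raw anchor value into (kind, index, is_block_comment)."""
    kind = _KINDS.get(value & ANCHOR_MASK, "invalid")
    return kind, value & ANCHOR_INDEX, bool(value & ANCHOR_BLKCMT)


def anchors_in_line(raw_line):
    """Decode every address tag found in one raw (tagged) pseudocode line."""
    return [decode_anchor(int(digits, 16)) for digits in _ANCHOR_RE.findall(raw_line)]
```

Applied to the values above, 0x40000007 decodes as a local variable anchor with index 7, while 0x31 through 0x34 on the objc_release line all decode as plain ctree item anchors.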

Missing license information

Hi!

Great job on providing a higher level API for Hex-Rays decompiler!

There are a few bugs we've encountered when we first tried it and would like to contribute patches to fix them. However, the license isn't explicitly given. I know, it's GitHub, and you probably want people to fork, but legally we have no right to republish (fork) without your explicit consent. You probably also want to protect yourself and/or FireEye and keep a copyright notice and credit in forks.

It mentions MIT here: https://github.com/fireeye/FIDL/blob/master/setup.py#L22. Should it be MIT?

Once this is fixed we'll open pull requests with the fixes,

Thanks!
M-E

Feature request: ability to retrieve the `citem_t` corresponding to a given cursor position on a given line

Given some code like this:

33|   v28 = a3;
34|   v27 = objc_retain(CFSTR("/Library/MobileSubstrate/DynamicLibraries/libFLEX.dylib"));
35|   v3 = objc_msgSend(&OBJC_CLASS___NSFileManager, "defaultManager");
36|   v26 = (void *)objc_retainAutoreleasedReturnValue(v3);

with the cursor right here (column 38 in the code above) for example:

            v
OBJC_CLASS__|_NSFileManager

funcEA = ...
line = 35
cursor = 38
item = fidl.lex_citem_at_pos(funcEA, line, cursor)

# Prints obj
item.cexpr.opname
# Prints address of _OBJC_CLASS_$_NSFileManager
item.cexpr.obj_ea

The explicit cursor position is important to me as I'm not working directly in IDA; I would like to be able to query arbitrary locations without using IDA's cursor API.

`my_decompile` fails on IDA 7.0

I realize this repo doesn't strive to support Python 2 / IDA < 7.4, but I think a lot of it works out of the box except for this code right here: https://github.com/fireeye/FIDL/blob/e6ceb000cda43b450717eb171309c02dee06dd4f/FIDL/decompiler_utils.py#L1070-L1073

The version of this function present in 7.0 only accepts an address and a failure pointer, no flags. Would it be possible to somehow detect which one is available and call that? Or to add a new Python 2.7 and IDA < 7.4 release to call decompile() without the flags?
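
One way the detection could work is to try the newer signature and fall back when the keyword is rejected. This is a sketch, not the project's code: decompile_compat is a made-up name, and it assumes the older binding raises TypeError when passed an unknown flags keyword. The decompile function is passed in so the helper itself does not depend on a particular IDA version.

```python
def decompile_compat(decompile_fn, ea, flags=0):
    """Call decompile_fn with flags when supported, otherwise without them.

    decompile_fn is expected to be something like ida_hexrays.decompile
    (assumption: the pre-7.4 version rejects the flags keyword with TypeError).
    """
    try:
        return decompile_fn(ea, flags=flags)
    except TypeError:
        # Older signature: only an address (and an optional failure object)
        return decompile_fn(ea)
```

A caveat of this approach: a TypeError raised for any other reason inside the call would also trigger the fallback, so checking the IDA version explicitly may be the more robust choice.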

Decompiled ctree accessed before it's populated by hexrays

Hi,

I've noticed that in the latest version of IDA for Linux (7.5.200728), FIDL fails with the following error:

Python 3.8.5
[GCC 9.3.0] 
IDAPython v7.4.0 final (serial 0) (c) The IDAPython Team <[email protected]>
--------------------------------------------------------------------------------------
Python>import FIDL.decompiler_utils as du
Python>c = du.controlFlowinator(ea=here(),fast=False)
...
 -> OK
in method 'ctree_items_t___getitem__', argument 2 of type 'size_t'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/path/FIDL/FIDL/decompiler_utils.py", line 1403, in __init__
    self._generate_better_cfg()
  File "/path/FIDL/FIDL/decompiler_utils.py", line 1845, in _generate_better_cfg
    hi = citem2higher(obj)  # cinsn_t
  File "/path/FIDL/FIDL/decompiler_utils.py", line 530, in citem2higher
    if citem.is_expr():
AttributeError: 'NoneType' object has no attribute 'is_expr'

The error seems to occur because ida_hexrays.decompile() populates the ctree lazily: if body.cblock is accessed directly, before any decompiler API call that forces the structure to be populated, it returns None.

Calling refresh_func_ctext() after decompile() fixes the issue.
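
A minimal sketch of that workaround, with the decompile function injected so the snippet stays independent of IDA. The helper name decompile_and_populate is hypothetical; refresh_func_ctext() is the method named above.

```python
def decompile_and_populate(decompile_fn, ea):
    """Decompile ea and force the ctree to be populated before it is used.

    decompile_fn is assumed to behave like ida_hexrays.decompile and to
    return either a cfunc-like object or None.
    """
    cf = decompile_fn(ea)
    if cf is not None:
        # Forces the lazily built ctree/ctext to be generated
        cf.refresh_func_ctext()
    return cf
```

Inside IDA this would presumably be called as decompile_and_populate(ida_hexrays.decompile, ea) before touching body.cblock.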

LocByName is deprecated in IDA 7.4. Use get_name_ea_simple instead

On IDA 7.4, the LocByName function was removed and replaced by get_name_ea and get_name_ea_simple.

Calling display_all_calls_to, which is the only API that uses LocByName, fails with the following error:

Python>import FIDL.decompiler_utils as du
Python>du.display_all_calls_to("decryptString")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\apps\fidl\FIDL\decompiler_utils.py", line 1309, in display_all_calls_to
    f_ea = LocByName(func_name)
NameError: name 'LocByName' is not defined
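
If both old and new IDA versions need to keep working, a small shim could pick whichever function exists. This is a sketch under assumptions: resolve_name_ea is a made-up helper, and the module object (for example idc) is passed in explicitly rather than imported here.

```python
def resolve_name_ea(api, name):
    """Resolve a name to an address using whichever lookup the module offers.

    api is expected to be a module-like object such as idc (assumption).
    The newer function is tried first, then the pre-7.4 name.
    """
    for attr in ("get_name_ea_simple", "LocByName"):
        lookup = getattr(api, attr, None)
        if lookup is not None:
            return lookup(name)
    raise AttributeError("no name lookup function found on %r" % (api,))
```

With this, the failing line would become something like f_ea = resolve_name_ea(idc, func_name).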

pseudoViewer Question

Hello!
First of all, I'd like to thank you for making FIDL public, it really helps!

I have a question regarding the following code from FIDL decompiler_utils.py:

if not has_cached_cfunc(ea):
    # Open the disassembly view here
    # to populate the cache
    pw = pseudoViewer()
    pw.show(ea=ea)
try:
    cf = decompile(ea=ea, flags=ida_hexrays.DECOMP_NO_WAIT)
except ida_hexrays.DecompilationFailure as e:
    print("Failed to decompile @ {:X}".format(ea))
    cf = None

This code creates a lot of decompiler windows and slows down or even hangs IDA when working with large databases. If I skip pseudoViewer and just call decompile() on the target functions, the code works fine (as far as I can see).
Can you explain why you decided to use pseudoViewer? What are its benefits compared to omitting this code and just using decompile()?

Nonetype object has no is_expr

Python3 latest ida, hash: 3126280e968397a118b2e75e5349ef613e6ea5a1aca458c38a9271490458313c

import itertools
from idaapi import *
from idautils import *
from idc import *
import ida_hexrays
import FIDL.decompiler_utils as du

def string_decoder(args):
    if args and args[0].type == 'string':
        s = bytearray.fromhex(args[0].val)
        res = ""
        last = s[0]
        for a,b in zip(s[1:], itertools.cycle('IIDH47ETIBQRYOF258RYOSYW5XVMEYODH257Y')):
            x = a ^ ord(b)
            z = (x - last) % 255
            res+= chr(z)
            last = a
        return res

def get_func_start(xref):
    try:
        if len(get_func_name(xref.frm)) > 0:#get_func_start will crash IDA without this additional check
            if xref.iscode:
                func = get_func(xref.frm)
                find_func_bounds(func, idaapi.FIND_FUNC_DEFINE)
                return func.start_ea
        return 0
    except:
        return 0

def main():
    addr = 0x004808E0
    func_list = []
    if addr is not None:
        for xref in XrefsTo(addr, ida_xref.XREF_ALL):
            f = get_func_start(xref)
            if f == 0:
                print('xref outside of defined function %x' % xref.frm)
            else:
                func_list.append(f)
    func_list = set(func_list)
    count = 0
    for f in func_list:
        c = du.controlFlowinator(ea=f)
        for co in c.calls:
            if co.call_ea == addr:
                t = string_decoder(co.args)
                du.create_comment(co.c,co.ea,'%s' % (t))
                count+=1
    print('calls found %d' % (count))
    
if __name__ == '__main__':
    main()

Traceback (most recent call last):
  File "<string>", line 53, in <module>
  File "<string>", line 44, in main
  File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 1402, in __init__
    self._generate_better_cfg()
  File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 1844, in _generate_better_cfg
    hi = citem2higher(obj)  # cinsn_t
  File "C:\Users\steve\AppData\Roaming\Python\Python38\site-packages\FIDL\decompiler_utils.py", line 530, in citem2higher
    if citem.is_expr():
AttributeError: 'NoneType' object has no attribute 'is_expr'

Check out this corner case

  File "<string>", line 62, in main
  File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1388, in __init__
    self._generate_i_cfg(blocks_to_expand=blocks)
  File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1813, in _generate_i_cfg
    self._generate_i_cfg(blocks_to_expand=blocks_to_expand)
  File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1800, in _generate_i_cfg
    new_blocks = self._expand_switch_block(block)
  File "C:\Python27\lib\site-packages\FIDL\decompiler_utils.py", line 1609, in _expand_switch_block
    self.i_cfg.add_edge(case_ins[-2].index, succ)
IndexError: list index out of range
