Git Product home page Git Product logo

Comments (12)

arquicanedo avatar arquicanedo commented on July 20, 2024

And this is what Ghidra shows:

ERROR:jfx_bridge.bridge:InteractiveListener failed trying to callback client: Didn't receive response 30af74a4-027b-4da8-85ac-0589445d8efa before timeout

from ghidra_bridge.

justfoxing avatar justfoxing commented on July 20, 2024

Yeah, I don't have any Mac gear to test on, so hopefully this isn't something OS-specific.

If you don't mind doing a little information gathering, that'd be really helpful:

  1. Could I get a copy of whatever script/commands you're running that's reliably causing the issue (including the commands to set up the bridge)?
  2. If you disable interactive mode, do you get the same issue? You can try this with creating the bridge like b = ghidra_bridge.GhidraBridge(namespace=globals(), interactive_mode=False), but note that it may cause other issues - I'm just interested in trying to narrow down the cause.
  3. Can I get you to try to reproduce the issue with client and server debug logging turned on, and upload the logs?

To setup client logging, create the bridge with the loglevel argument set to logging.DEBUG - something like:

import ghidra_bridge
import logging

b = ghidra_bridge.GhidraBridge(namespace=globals(), loglevel=logging.DEBUG)

That'll spam the debug messages to your terminal as the script runs.

To setup server logging, modify the ghidra_bridge_server.py in the directory you installed the server scripts to (e.g., ~/ghidra_scripts), and replace the first non_comment line in run_server() (should be line 122) with the following, then completely close down Ghidra and restart it from scratch.

        server = bridge.BridgeServer(server_host=server_host,
                            server_port=server_port, loglevel=logging.DEBUG, response_timeout=response_timeout)
        handler = logging.FileHandler('ghidra_bridge_server.log')
        handler.setFormatter(logging.Formatter(fmt='%(asctime)s %(name)s %(levelname)s %(message)s'))
        server.logger.addHandler(handler)  

That should output a ghidra_bridge_server.log file in the directory you started Ghidra from (because I've had issues trying to copy logs out of the tiny Ghidra scripting console).

Thanks a lot!

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

@justfoxing sure. I'll do it one step at a time. Here is another error that just popped up on the second run of the program. Worth mentioning that my script is a jupyter notebook. This may be introducing some additional complexity to the pipeline.

In all instances I'm launching the ghidra_bridge_server_background.py directly from the Ghidra menu. Anyway, here is another common error I get. This time on the second run of this snippet:

import ghidra_bridge
import math
b = ghidra_bridge.GhidraBridge(namespace=globals())

Exception: Didn't receive response 748897a0-f665-489a-9d3f-eb67ca1f9a12 before timeout

Ghidra simply shows this:

WARNING:jfx_bridge.bridge:Handling connection from ('127.0.0.1', 60044)

and the corresponding error (interactive mode as per the default config)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get_response(self, response_id, timeout)
    575             # wait for the data
--> 576             data = response.get(timeout)
    577         except:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get(self, timeout)
    539         if not self.event.wait(timeout):
--> 540             raise Exception()
    541 

Exception: 

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-8-2d46b219efff> in <module>()
      1 import ghidra_bridge
      2 import math
----> 3 b = ghidra_bridge.GhidraBridge(namespace=globals())

/Users/canedo/anaconda3/lib/python3.7/site-packages/ghidra_bridge/ghidra_bridge.py in __init__(self, connect_to_host, connect_to_port, loglevel, namespace, interactive_mode, response_timeout, hook_import)
     60             # track the namespace we loaded with - if we're part of an __enter__/__exit__ setup, we'll use it to automatically unload the flat api
     61             self.namespace = namespace
---> 62             self.get_flat_api(namespace=self.namespace)
     63 
     64     def get_flat_api(self, namespace=None):

/Users/canedo/anaconda3/lib/python3.7/site-packages/ghidra_bridge/ghidra_bridge.py in get_flat_api(self, namespace)
     70         """
     71 
---> 72         remote_main = self.remote_import("__main__")
     73 
     74         if namespace is not None:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in remote_import(self, module_name)
   1374 
   1375     def remote_import(self, module_name):
-> 1376         return self.client.remote_import(module_name)
   1377 
   1378     def remote_eval(self, eval_string, timeout_override=None, **kwargs):

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in wrapper(self, *args, **kwargs)
    163         if self.stats is not None:
    164             self.stats.add_hit(func.__name__)
--> 165         return func(self, *args, **kwargs)
    166 
    167     return wrapper

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in remote_import(self, module_name)
    954         self.logger.debug("remote_import {}".format(module_name))
    955         command_dict = {CMD: IMPORT, ARGS: {NAME: module_name}}
--> 956         return self.deserialize_from_dict(self.send_cmd(command_dict))
    957 
    958     @stats_hit

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in wrapper(self, *args, **kwargs)
    174     def wrapper(self, *args, **kwargs):
    175         start_time = time.time()
--> 176         return_val = func(self, *args, **kwargs)
    177         total_time = time.time() - start_time
    178 

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in send_cmd(self, command_dict, get_response, timeout_override)
    816             # wait for the response
    817             response_dict = self.response_mgr.get_response(
--> 818                 cmd_id, timeout=timeout_override if timeout_override is not None else self.response_timeout)
    819 
    820             if response_dict is not None:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get_response(self, response_id, timeout)
    577         except:
    578             raise Exception(
--> 579                 "Didn't receive response {} before timeout".format(response_id))
    580 
    581         if TYPE in data:

Exception: Didn't receive response 748897a0-f665-489a-9d3f-eb67ca1f9a12 before timeout

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

Attached the jupyter notebook that reliably reproduces the errors on MacOS. I just got this error that seems different to others reported above.


---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get_response(self, response_id, timeout)
    575             # wait for the data
--> 576             data = response.get(timeout)
    577         except:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get(self, timeout)
    539         if not self.event.wait(timeout):
--> 540             raise Exception()
    541 

Exception: 

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-27-cb6df5a61ed3> in <module>()
----> 1 analyze()

<ipython-input-26-0dc0b11c999f> in analyze()
      5     print("Program's start address = %s and end address %s" % (start, end))
      6 
----> 7     caller, callee = func_call_distance(start, end)
      8     print(caller)
      9     print(callee)

<ipython-input-25-97cb7d7f64e7> in func_call_distance(f_start, f_end)
      7 
      8     while function is not None:
----> 9         dist_1 = func_callers_distance(function)
     10         dist_2 = func_callee_distance(function)
     11         distance_caller.append(dist_1)

<ipython-input-23-975b31aeaa5c> in func_callers_distance(f)
      1 def func_callers_distance(f):
      2     distance_list = []
----> 3     taskMonitor = ghidra.util.task.TaskMonitor.DUMMY
      4     calling_functions = f.getCallingFunctions(taskMonitor)
      5     for xref in calling_functions:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in __getattribute__(self, attr)
   1496         else:
   1497             try:
-> 1498                 result = self._bridged_get(attr)
   1499             except BridgeException as be:
   1500                 # unwrap AttributeErrors if they occurred on the other side of the bridge

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in _bridged_get(self, name)
   1517             return self._bridge_overrides[name]
   1518 
-> 1519         return self._bridge_conn.remote_get(self._bridge_handle, name)
   1520 
   1521     def _bridged_get_all(self):

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in wrapper(self, *args, **kwargs)
    163         if self.stats is not None:
    164             self.stats.add_hit(func.__name__)
--> 165         return func(self, *args, **kwargs)
    166 
    167     return wrapper

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in remote_get(self, handle, name)
    829         self.logger.debug("remote_get: {}.{}".format(handle, name))
    830         command_dict = {CMD: GET, ARGS: {HANDLE: handle, NAME: name}}
--> 831         return self.deserialize_from_dict(self.send_cmd(command_dict))
    832 
    833     @stats_hit

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in wrapper(self, *args, **kwargs)
    174     def wrapper(self, *args, **kwargs):
    175         start_time = time.time()
--> 176         return_val = func(self, *args, **kwargs)
    177         total_time = time.time() - start_time
    178 

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in send_cmd(self, command_dict, get_response, timeout_override)
    816             # wait for the response
    817             response_dict = self.response_mgr.get_response(
--> 818                 cmd_id, timeout=timeout_override if timeout_override is not None else self.response_timeout)
    819 
    820             if response_dict is not None:

/Users/canedo/anaconda3/lib/python3.7/site-packages/jfx_bridge/bridge.py in get_response(self, response_id, timeout)
    577         except:
    578             raise Exception(
--> 579                 "Didn't receive response {} before timeout".format(response_id))
    580 
    581         if TYPE in data:

Exception: Didn't receive response 72fb99e2-301c-43ae-8b6f-2aaeb7a97568 before timeout

​

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

ghidra_bridge_timeout.ipynb.zip

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

I'll now reproduce with the client/server logging turned on and report back probably tomorrow.

from ghidra_bridge.

justfoxing avatar justfoxing commented on July 20, 2024

Oh, sweet, I keep wanting to take the time to post an example ghidra_bridge Jupyter notebook but never get around to it, so stoked that you're using it like that!

So, I had a look at your notebook and while I'm not seeing anything to help me reproduce the connection resets you mentioned at the start, I reckon I might know what the timeout problem is. Basically, when the bridge starts getting lots of requests in quick succession, it'll often slow down enough to start timing out. Unsure as to why - I haven't been able to track down any leaks in my code; it could just be that Python isn't the most performant language in the world. Don't worry, there is a fix.

Most of the requests in this case are in your func_callers_distance() and func_callee_distance() functions. The hottest part is in the for loops:

...
    for xref in calling_functions:
        dist = abs(xref.getEntryPoint().subtract(f.getEntryPoint()))
        distance_list.append(dist)
...

In particular, this line dist = abs(xref.getEntryPoint().subtract(f.getEntryPoint())) represents 6 bridge calls (one to get the getEntryPoint method from xref, one to call it, one to get the getEntryPoint method from f, one to call it, one to get the subtract method from the result of the first getEntryPoint, and one to call it with the result from the second entry point) for each calling function, so that could rapidly add up to a lot of bridge calls for functions with a lot of callers/callees.

(To be clear, I'm not shitting on your code here, I think it's good! It's just a really great example of this problem, and I'm trying to work out a better way to document this for the future)

What you could do to avoid this is to change to using the remote_eval() function to evaluate a list comprehension. This might look something like the following:

def func_callers_distance_re(f): # func_callee_distance_re would look very similar
    # Note: remote_eval has the ghidra module already imported, and you can pass in values like f as keyword arguments
    distance_list = b.remote_eval("[abs(xref.getEntryPoint().subtract(f.getEntryPoint())) for xref in f.getCallingFunctions(ghidra.util.task.TaskMonitor.DUMMY)]", f=f) 
    return distance_list

This will run as one bridge call for the whole function, instead of (6 * the number of xrefs) + a handful of bridge calls for the setup. So, not only will it be less likely to cause a load-related timeout, the whole script will be a lot faster (a quick test suggests at least two orders of magnitude).

If you can, give that a try and let me know if that helps.

As a note, I haven't done a lot of intensive work in Jupyter before, so this was the first time I noticed that it's keeping the bridges open even when you rerun the cells that start the bridge. This does seem like it contributes to the problem - I had a couple of runs where I'd end up with bridges timing out sooner and sooner until I couldn't get anything from them at all - so you may want to try restarting the kernel when that happens.

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

@justfoxing thanks for the insights. I will try the remote_eval() in the next couple of days and report back. I like the idea of having the code run 100x faster and closer to the native Ghidra-python speed.

I will keep this in mind when writing ghidra_bridge code. Try to bundle as many Ghidra API calls as possible into a single remote_eval.

I still owe you the debugging outputs. Will be doing that between today and tomorrow.

Jupyter notebooks were a great way to get familiar with ghidra_bridge and the Ghidra API, but I think I'll just switch to plain scripts. I've noticed the same, once Jupyter times out, it cascades and becomes unusable. Instead of restarting the Jupyter kernel I was restarting the server from Ghidra.

from ghidra_bridge.

justfoxing avatar justfoxing commented on July 20, 2024

I will keep this in mind when writing ghidra_bridge code. Try to bundle as many Ghidra API calls as possible into a single remote_eval.

I wouldn't do it without a reason - it definitely makes the code harder to read/understand, and "premature optimization is the root of all evil". Removing a couple of calls here and there isn't going to make a noticeable difference, but a chunky for loop that can be turned into a list comprehension can really pay off.

Other things you can do are to pull values that get reused out of loops/functions so they're only bridged once - for example, instead of doing taskMonitor = ghidra.util.task.TaskMonitor.DUMMY inside the function (4 bridge calls every time the function runs), it could have been pulled out to be a global variable.

I've just pushed an update to ghidra_bridge (0.2.5) that allows you to gather statistics about bridge runs to help with optimising stuff like this.

import ghidra_bridge

b= ghidra_bridge.GhidraBridge(namespace=globals(), record_stats=True)
start_stats = b.get_stats()

# do some stuff

end_stats = b.get_stats()
print(end_stats - start_stats)

Should give you something like Stats(total_hits=3606,hits={'remote_get_type': 5, 'remote_get': 1146, 'remote_del': 1755, 'remote_call': 700},total_time=(3606, 20.095478296279907),times={'send_cmd': (3606, 20.095478296279907)})

Total hits is the number of times a bridge call goes out, so that's the number you want to see reduce significantly to justify a messy remote_eval().

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

@justfoxing, here some pending results:

  • disabling interactive mode results in the same timeout error.
  • logging.DEBUG throws this on the Jupyter console IOStream.flush timed out. The logging output is just too big to share. But somehow the logging seems to fix the problem. I didn't get a crash in 3 consecutive runs.
  • Enabling the ghidra_bridge_server.log is not very informative.
    ghidra_bridge_server.log after a crash.
  • As you predicted, the remote_eval() fixes the problem and it is also super fast

Thanks for pushing 0.2.5. I will upgrade and continue using ghidra_bridge.

In terms of this issue, I think we can close it. Thanks for all the help!

from ghidra_bridge.

arquicanedo avatar arquicanedo commented on July 20, 2024

One more thing, I just found the example_py3_from_ghidra_bridge.py. I think this is the best way to use ghidra_bridge. Take advantage of the Ghidra-python infrastructure but make use of external py3.7 pacakges. I'll continue using it this way. This is a great package. Thanks for making it available.

from ghidra_bridge.

justfoxing avatar justfoxing commented on July 20, 2024

Glad we got something that worked for you!

from ghidra_bridge.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.