
Comments (13)

marirs commented on June 25, 2024

Yes, that's the MongoDB document size limit. If the document crosses the 16MB limit, Mongo cannot save it. So the next step Cuckoo takes is to see if it can delete some key and then attempt the save again, but with no luck here. So that particular analysis will not be saved into Mongo. If the JSON report was enabled, you should still have report.json inside storage/analysis//reports, but it won't be displayed in the UI.

This happens sometimes when you have a lot of reporting data, which then exceeds the Mongo document size limit.
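For context, MongoDB enforces a hard 16MB limit on a single BSON document. A minimal sketch (not CAPE code; the report dict is just a stand-in) of checking a result set against that limit with pymongo's bson module before attempting the save:

    import bson  # ships with pymongo

    MAX_BSON_SIZE = 16 * 1024 * 1024  # MongoDB's per-document limit

    report = {"target": "sample.exe", "procdump": []}  # stand-in for the analysis results

    try:
        encoded_size = len(bson.encode(report))       # PyMongo >= 3.9
    except AttributeError:
        encoded_size = len(bson.BSON.encode(report))  # older PyMongo releases

    if encoded_size >= MAX_BSON_SIZE:
        print("Report of %d bytes will not fit in a single MongoDB document" % encoded_size)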

To counter this somewhat, compressing the results was the solution.

If you have pulled the latest from Kevin's repo, you can enable compressresults in reporting.conf (a sketch of the relevant entry is below), restart Cuckoo, and try that sample again.
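A sketch of the kind of reporting.conf entry meant here; the section name is assumed to mirror the compressresults.py module, so check your own reporting.conf for the exact name used by your version:

    [compressresults]
    enabled = yes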

Let us know how that goes :)

enzok commented on June 25, 2024

This already has compressresults enabled. I'm just curious why the delete failed. Could the delete failure be handled more gracefully, so that just the offending results key is omitted instead of the whole save failing?

kevoreilly commented on June 25, 2024

Hi enzok, I agree this failure should be handled more gracefully. I'll try and work out a way to do this - if you can share a sample hash please do.

enzok commented on June 25, 2024

I modified mongodb.py with this code to remedy the issue (starting at ~ line 182):

    try:
        self.db.analysis.save(report)
    except InvalidDocument as e:
        parent_key, psize = self.debug_dict_size(report)[0]
        if not self.options.get("fix_large_docs", False):
            # Just log the error and problem keys
            log.error(str(e))
            log.error("Largest parent key: %s (%d MB)" % (parent_key, int(psize) / 1048576))
        else:
            # Delete the problem keys and check for more
            error_saved = True
            while error_saved:
                if type(report) == list:
                    report = report[0]

                try:
                    if type(report[parent_key]) == list:
                        for j, parent_dict in enumerate(report[parent_key]):
                            child_key, csize = self.debug_dict_size(parent_dict)[0]
                            del report[parent_key][j][child_key]
                            log.warn("results['%s']['%s'] deleted due to >16MB" % (parent_key, child_key))
                    else:
                        child_key, csize = self.debug_dict_size(report[parent_key])[0]
                        del report[parent_key][child_key]
                        log.warn("results['%s']['%s'] deleted due to >16MB" % (parent_key, child_key))

                    try:
                        self.db.analysis.save(report)
                        error_saved = False
                    except InvalidDocument as e:
                        parent_key, psize = self.debug_dict_size(report)[0]
                        log.error(str(e))
                        log.error("Largest parent key: %s (%d MB)" % (parent_key, int(psize) / 1048576))
                except Exception as e:
                    log.error("Failed to delete child key: %s" % str(e))
                    error_saved = False

    self.conn.close()
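For readers unfamiliar with the helper used above: debug_dict_size is taken here to return the top-level keys of a dict ranked by serialized size, largest first, which is why indexing with [0] picks the biggest offender. A rough sketch of that assumed behaviour (the actual method in mongodb.py may differ):

    import json

    def debug_dict_size(dct):
        # In mongodb.py this is a method (self.debug_dict_size); shown standalone here.
        if isinstance(dct, list):
            dct = dct[0]
        # (key, approximate serialized size in bytes), largest first
        sizes = [(key, len(json.dumps(value, default=str))) for key, value in dct.items()]
        return sorted(sizes, key=lambda item: item[1], reverse=True)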

Correct me if I'm wrong, but I don't believe that procdump results are being compressed. I think when there are too many yara strings, the results grow too large.

kevoreilly commented on June 25, 2024

Ah yes, I will look at adding compression to procdump output too, as well as implementing the fix you have kindly posted above.

Thanks for your help.

kevoreilly commented on June 25, 2024

I have now pushed this fix and enabled compression for procdump. Please let me know if this fixes (or alleviates) this issue.

enzok commented on June 25, 2024

Thank you.

enzok commented on June 25, 2024

Will compressing the report results affect the Elasticsearch DB (search only)? I noticed I'm now getting serialization errors when storing data into Elasticsearch.

kevoreilly commented on June 25, 2024

Hmm possibly - I vaguely recall seeing problems previously with Elasticsearch and compression. Any chance you could provide some more details to help me try and narrow it down?

enzok commented on June 25, 2024

It appears that the compressed data doesn't serialize. I added the following code to the elasticsearchdb.py reporting module, and it solved the issue.

import json
import zlib

~ line 137:

        try:
            report["summary"] = json.loads(zlib.decompress(results.get("behavior", {}).get("summary")))
        except:
            report["summary"] = results.get("behavior", {}).get("summary")

marirs commented on June 25, 2024

I would rather do it this way:
Since you don't want the compressed results to sit in Elastic, and the views can in any case detect whether the results are compressed or not, you could change the order of these two processing modules:

elasticsearchdb.py, line 25:
    change order = 9998 to order = 9997

compressresults.py, line 27:
    change order = 9997 to order = 9998

This way, compressresults will run after the Elasticsearch reporting has been done.
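For anyone unfamiliar with the mechanism: reporting modules carry an order attribute and are assumed to run in ascending order, so giving elasticsearchdb the lower value means it indexes the results before they are compressed. A toy illustration of that sorting (hypothetical module list):

    # Hypothetical (name, order) pairs; in Cuckoo/CAPE each reporting class defines order
    modules = [("compressresults", 9998), ("elasticsearchdb", 9997)]

    for name, order in sorted(modules, key=lambda m: m[1]):
        print("running %s (order %d)" % (name, order))
    # elasticsearchdb runs first, so it receives the uncompressed results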

enzok commented on June 25, 2024

That works for me. I completely forgot about being able to set the order.

kevoreilly commented on June 25, 2024

Ah fantastic - thanks both for finding and fixing this. I will make this change now.
