
Comments (7)

max-leuthaeuser commented on September 22, 2024

To exclude multithreading issues (as we have seen in other recent bug reports), could you run it with -J-XX:ActiveProcessorCount=1?

from joern.

wildoranges commented on September 22, 2024

> To exclude multithreading issues (as we have seen in other recent bug reports), could you run it with -J-XX:ActiveProcessorCount=1?

With the -J-XX:ActiveProcessorCount=1 option, both machines run the joern-parse command successfully. Thanks.


wildoranges commented on September 22, 2024

> To exclude multithreading issues (as we have seen in other recent bug reports), could you run it with -J-XX:ActiveProcessorCount=1?

More info: it seems that joern-parse might not work properly with multiple CPUs. I was able to generate cpg.bin successfully when using -J-XX:ActiveProcessorCount=64. However, when using -J-XX:ActiveProcessorCount=128, the aforementioned error occurs (this machine has two AMD EPYC 7763 64-Core processors).


max-leuthaeuser commented on September 22, 2024

@johannescoetzee @DavidBakerEffendi @bbrehm
So that worries me somewhat. We have quite a few beefy machines here with tons of processors and RAM.
Could the implementation of the parallel passes (e.g., ForkJoinParallelCpgPass) be the issue here?
Or some limitation w.r.t. ODB with very high thread counts?

@wildoranges Thanks for the detailed report! We'll have a look.
That 64 works while 128 fails is quite an interesting result.


wildoranges commented on September 22, 2024

More info: after testing on my two machines, -J-XX:ActiveProcessorCount=123 produces no errors, but any processor count > 123 triggers the aforementioned error. 123 seems to be the boundary.


bbrehm commented on September 22, 2024

Ok, the bug is here.

If you run the pass with 7 nodes that require linking, and have 16 processors available, then you end up with a batch size of 7/16, which is zero. This throws in the grouped iterator thing.

If you run the pass with 101 nodes that require linking, and have 102 processors available, then you end up with a batch size of 101/102, which is zero. This throws in the grouped iterator thing.

There is a one-line quickfix -- just set the batch size to 100 and be done with that nonsense; starting a thread ain't worth it for fewer items anyway. Or if you insist, just ensure that the batch size is at least 1.

There is an actual fix as well: This code is really, really bad. So I'll need to take a look at how this got into the codebase in the first place, talk to both author and reviewers, and presumably write another "how not to shoot yourself in the foot when multi-threading" primer.

I'm out with sick kid for today, so I'll look into it once I'm back. Feel free to merge a quickfix in the meantime.
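
For readers following along, the arithmetic described above can be reproduced with a small standalone sketch. This is not the Joern code: it is written in Python for illustration, the function names (naive_batch_size, safe_batch_size, batches) are hypothetical, and Python's range() rejecting a step of 0 merely stands in for the grouped iterator rejecting a group size of 0.

```python
def naive_batch_size(num_items: int, num_processors: int) -> int:
    """Integer division as described above; rounds down to 0 when items < processors."""
    return num_items // num_processors


def safe_batch_size(num_items: int, num_processors: int) -> int:
    """Same division, clamped so that a batch always holds at least one item."""
    return max(1, num_items // num_processors)


def batches(items: list, batch_size: int) -> list:
    """Split items into consecutive chunks; range() raises ValueError for a step of 0."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    nodes = list(range(7))  # 7 nodes that require linking (hypothetical stand-ins)

    # Both cases from the comment above collapse to a batch size of zero.
    print(naive_batch_size(7, 16))
    print(naive_batch_size(101, 102))

    try:
        batches(nodes, naive_batch_size(len(nodes), 16))
    except ValueError as err:
        print("zero batch size rejected:", err)

    # Clamping to at least 1 yields one batch per node instead of an exception.
    print(len(batches(nodes, safe_batch_size(len(nodes), 16))))
```

The clamp corresponds to the "ensure that the batch size is at least 1" option; the fixed batch size of 100 suggested above would replace safe_batch_size with a constant.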


DavidBakerEffendi commented on September 22, 2024

Possibly related? #4596

According to this bug fix, it was introduced in #4227, and unfortunately, @bbrehm, you were also a reviewer 😉

It was a pretty big, complex, and drawn-out PR, however, with lots of iterations, and it was mostly driven by customer impact: the original concurrency of this source finder was not scaling well.

