Git Product home page Git Product logo

Comments (13)

kkolyan avatar kkolyan commented on September 22, 2024 2

Yet another report with "memory leak" reproduction: https://github.com/kkolyan/jackson_leak.
It uses ObjectMapper from 1000 threads and during an hour occupy heap of 8G (with initial occupation by test data of 600M), leading to permanent freeze.

from jackson-core.

mariofusco avatar mariofusco commented on September 22, 2024 1

HI @cowtowncoder I would like to investigate this memory issue, but at the moment I have no clue on how to reproduce it on my side. I also asked here and I'm waiting for an answer. I'm reviewing its implementation again, and in all honesty it seems quite straightforward, so at the moment I don't know what to do about it. About making the LockFree pool bounded, this is of course feasible, but I'm against for a few reasons:

  1. In general I don't like blind fixes
  2. In this specific case making the pool bounded won't help as to investigate the problem, but only to put it under the carpet
  3. Making that pool bounded won't be free however, both adding some further complexity to it and introducing a performance cost
  4. We already have a bounded pool implementation, with performance very similar to the LockFree, so if we want a bounded pool as default implementation, let's just use it

from jackson-core.

mariofusco avatar mariofusco commented on September 22, 2024 1

Thinking twice, now I see an admittedly weird situation where the LockFree pool can grow indefinitely. In that pool there is an heuristic to limit the number of retries under heavy contention. When this happens it gives up attempting to retrieve a resource from the pool and creates a new one instead. However this heuristic is implemented symmetrically on both acquire and release, so similarly on release it gives up trying to put back a resource into the pool after a few failed attempts. In essence the pool will grow if there is a huge asymmetry in its usage, when the acquire is always under heavy contention but the release isn't. I'm not sure how to deal with this situation to be honest, but still I would like to see it happens into the wild before deciding on any action.

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024 1

I may be wrong, but to me it seems like ConcurrentDequePool might be the safer but similar implementation to use -- it uses Deque for storing pooled objects.
And compared to BoundedPool it has the benefit of lower overhead for cases where pool itself is not used (since there's no allocations with Deque, it being linked list).

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

/cc @mariofusco @pjfanning -- I hope to learn if there is some specific usage scenario and create multi-threaded reproduction. I am suspecting that somehow reuse from the pool is failing, but perhaps not for every call. jackson-benchmarks, for example, does not exhibit this problem: but it also does not by default use heavy concurrency and tests do not run long enough to necessarily accumulate humongous buffers.

Also: it is interesting that the one concern I did have wrt unbounded pools ended up happening, it seems.
So it feels like my insistence on ideally having maximum size to avoid this problem (... although, doing so might hide an issue to) seems warranted. For whatever that is worth.

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

@mariofusco Your thoughts align remarkably closely with my thinking :)
(not that this guarantees we are necessarily right but it is ok sanity check I think)

As to reproduction, what I was thinking was indeed running a multi-threaded scenario with N threads, reading and writing, against modified copy of pool which allows accessing size, printing it periodically and checking if size would be growing.
I guess one unfortunate part is that any "lost" acquire will add to size and although in some use cases the opposite could occur too (release does not happen on some code path, I assume that is possible too), there are probably cases where release works reliably so even small incremental unbalance will lead to increase retention over time.

As to adding bounded size: I 100% agree that we need to understand the problem first and not add something that would likely just hide the problem.
Once we know what is going on we can reconsider it.
(I have some ideas on how it could work but not actual plan; avoiding need to use single synchronized counter)

from jackson-core.

franz1981 avatar franz1981 commented on September 22, 2024

I am now reading past the issues and probably giving up on contention is not what we want.
Contention is an inherent problem of concurrent primitives like (compareAndSet-like), but allocating while still being able to release it back, can just causes unbounded growth, which eventually is going again to consume the cpu we tried hard to save (spending it in GC marking and eventually moving, given that when grow it doesn't shrink).
Right now the easier choice imo is to use a counter on acquire/release to keep on trying with spinning and Thread::onSpinWait or yield is there are too many attempts, but without giving up.

Anothe solution can be to spread the contention or uses better primitives which "cannot fail" eg getAndSet

But that means redesigning what it does; just serialising while too contended seems the easier choice to me.

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

@mariofusco @franz1981: I created #1265 which sort of hints at possible problem, but isn't smoking gun necessarily. When running through 3 main implementations, LockFreePool occasionally gets a REALLY high pooled count -- I have seen 18,000 once (!) -- but seems to stabilize over time. Other implementations remain typically under thread count (going over isn't unexpected since test runs read and write for some loops, so getting to 2x is theoretically possible).

So no, this does not show monotonically increasing behavior, but suggests that spikes are possible.
It is of course possible (likely?) that contention behavior is heavily platform-dependant and maybe running on Eclipse on Mac mini just doesn't exhibit extreme contention.
I'll see if increasing thread count from 100 up (or, going down to, say, 20) can jog other behavior.

But if anyone else has time to run test, and/or read code and suggest changes, that'd be useful.

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

@kkolyan Thank you very much for creating this! I hope to try it out locally, and see if the other 2 implementations (and in particular, Deque backed ones) fared better.

EDIT: looks like it's not trivial to change impl to measure other impls due to RecyclerPoolSizeCalculator (I used 2.18.0-SNAPSHOT where RecyclerPool.pooledCount() was added).

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

@kkolyan Trying to run the reproduction, but since I am not a gradle user, not sure what invocation is expected. Guessing from build.gradle tries

./gradlew test
./gradlew application

neither of which works (build succeeds with ./gradlew build).
It'd be good to add invocation on README.md of the repo?

Also had to add mavenLocal() repository to use jackson-core locally built 2.17.1-SNAPSHOT (or could have added Sonatype OSS repository that has those built too).

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

I'm able to run the test on IDE just fine. Not yet seeing much of a trend, assuming I read output correctly.

But one related thing is this: I backported addition of

    default int pooledCount() {
        return -1;
    }

in RecyclerPool for 2.17 branch, so locally built (or one from Sonatype OSS repo) jackson-core has that available. This is useful to:

  1. Remove need for RecyclerPoolSizeCalculator since RecyclerPool.pooledCount() can be used
  2. Allow use of all RecyclerPool implementations to see that they don't (... or do?) exhibit similar leakage

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

Ok: created and resolved work (#1256 for 2.17.1 revert, #1266 for 2.18 update to concurrentDequePool).

One last thing: should we try to fix "lockFreePool", or deprecate? (and remove from 3.0).

Create a placeholder ticket: #1271.

from jackson-core.

cowtowncoder avatar cowtowncoder commented on September 22, 2024

Ok, I think we have a reasonable idea as to how the retention may happen: due to imbalanced failures between acquiring reusable buffer (fails more often) vs releasing buffer to the pool (fails less often).
Given this, closing the issue; will proceed with #1271.

from jackson-core.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.