Comments (6)

shell-skrimp commented on June 24, 2024

It's very hard to reproduce. Heaptrack indicates it's only about ~6% of connections. I tried with a custom program that just did these checks endlessly and it never leaked. If I try on the program where I discovered this it takes 10+ hours for the leak to occur. I did my own custom changes to go-ceph; tried to remove runtime.SetFinalizer, tried directly freeing if conn.cluster != nil and it still leaked.

from go-ceph.

shell-skrimp commented on June 24, 2024

Going to give some context for any users who run into a problem similar to mine. Looking at the code, you can see how I handle closing the connection (via defer). For some reason that is not freeing up resources. I decided to use heaptrack and, lo and behold, rados_create2 leaks (see pic).

[image: leak]

If I drop the defer and instead just add Destroy/Shutdown where needed (on error, and before exit), there is no memory leak.

ioctx.Destroy()
conn.Shutdown()

It's really weird; it's as if the deferred calls to conn.Shutdown() or ioctx.Destroy() never get called.
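For readers checking the "defer does not get called" theory, here is a minimal sketch of how Go runs deferred cleanup on an error path. The `resource` type and the `work` and `run` functions are hypothetical stand-ins, not go-ceph's `rados.Conn` or `rados.IOContext`, so this only illustrates the language behavior and says nothing about librados itself.

```go
package main

import (
	"errors"
	"fmt"
)

// resource is a hypothetical stand-in for a connection or I/O context.
type resource struct {
	name string
	log  *[]string
}

// Close records that this resource was released.
func (r *resource) Close() { *r.log = append(*r.log, "close "+r.name) }

// work acquires two resources and defers their cleanup.
// Deferred calls run in last-in, first-out order on every return path.
func work(fail bool, log *[]string) error {
	conn := &resource{name: "conn", log: log}
	defer conn.Close()

	ioctx := &resource{name: "ioctx", log: log}
	defer ioctx.Close()

	if fail {
		return errors.New("simulated failure")
	}
	return nil
}

// run calls work and returns the cleanup log along with any error.
func run(fail bool) (log []string, err error) {
	err = work(fail, &log)
	return
}

func main() {
	log, err := run(true)
	fmt.Println(log, err) // [close ioctx close conn] simulated failure
}
```

Both cleanups run even though `work` returns early, and in the same order as the explicit `ioctx.Destroy()` then `conn.Shutdown()` calls above. One case where deferred calls are skipped is a process exit through `os.Exit` (which `log.Fatal` calls), so that is worth ruling out in the real program.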

shell-skrimp commented on June 24, 2024

@phlogistonjohn go-ceph does indeed appear to have a memory leak in how Shutdown is called. See the code above and golang/go#43363 (comment).

If you return a *rados.Conn and then immediately defer conn.Shutdown() it will not be evaluated properly and will therefore leak.

I have worked around this by just calling Shutdown()/Destroy() at the end of whatever it is I'm doing. However, it may also be possible to use defer func() { conn.Shutdown() }() to keep the code more idiomatic.

Since the arguments of a defer are evaluated at defer time, freeConn(c) and/or ensureConnected must be buggy in some way.

Basically, the connection I return after conn.Connect() isn't "complete" enough for defer conn.Shutdown() to properly reap resources. I might experiment with it more, but if this is done rapidly in a goroutine the connections can leak.
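To make the difference between the two defer forms concrete, here is a small sketch of what Go evaluates at the defer statement. `fakeConn` is a hypothetical type invented for illustration and is not go-ceph's `rados.Conn`. The sketch shows only general Go semantics: with `defer c.Shutdown(...)` the receiver and arguments are fixed when the defer statement executes, while a deferred closure reads the variable when the surrounding function returns.

```go
package main

import "fmt"

// fakeConn is a hypothetical stand-in for a connection handle.
type fakeConn struct{ name string }

// Shutdown records which connection was shut down.
func (c *fakeConn) Shutdown(log *[]string) { *log = append(*log, c.name) }

// directDefer defers the method call itself, so the receiver is
// evaluated at the defer statement and later reassignment is ignored.
func directDefer() (log []string) {
	c := &fakeConn{name: "first"}
	defer c.Shutdown(&log)
	c = &fakeConn{name: "second"}
	_ = c
	return
}

// closureDefer defers a closure, so c is read only when the
// surrounding function returns.
func closureDefer() (log []string) {
	c := &fakeConn{name: "first"}
	defer func() { c.Shutdown(&log) }()
	c = &fakeConn{name: "second"}
	return
}

func main() {
	fmt.Println(directDefer())  // [first]
	fmt.Println(closureDefer()) // [second]
}
```

When the variable is never reassigned, as in the usual connect-then-defer pattern, both forms shut down the same object, so the choice between them would not by itself be expected to explain a leak.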

phlogistonjohn commented on June 24, 2024

OK, thanks for the update. Without a lot of investigation on my part yet, an issue with Shutdown seems more plausible to me. I'm reopening this issue since it automatically got closed from the other PR. We'll look into it soon.

ansiwen commented on June 24, 2024

It's very hard to reproduce. Heaptrack indicates it's only about ~6% of connections. I tried with a custom program that just did these checks endlessly and it never leaked. If I try on the program where I discovered this it takes 10+ hours for the leak to occur. I did my own custom changes to go-ceph; tried to remove runtime.SetFinalizer, tried directly freeing if conn.cluster != nil and it still leaked.

@shell-skrimp So, I'm a bit confused now. Which of your findings from above (defer vs direct call etc.) are still valid? Now it sounds here like there is a leak no matter what. To be honest, it even sounds like there might be a race within ceph itself. But I just started to look into this, so it's just a gut feeling.

shell-skrimp commented on June 24, 2024

@ansiwen Neither is valid. I thought that calling directly was better than defer because initial testing showed no memory leak, but in the end there was still a memory leak; it just took thousands of new connections to the Ceph cluster to reproduce.

What I did in my testing:

  • Try removal of runtime.SetFinalizer; no difference.
  • Try removal of defer and directly free/close/destroy; no difference.
  • Change Shutdown to if c.cluster != nil { c.rados_shutdown(...) } (going by memory on this one); no difference.
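The nil guard in the last item can be sketched roughly as follows. Everything here is hypothetical: `guardedConn`, `handle`, and the `freed` counter are placeholders for illustration, and the real go-ceph `Shutdown` calls into librados through cgo rather than incrementing a counter.

```go
package main

import "fmt"

// handle is a hypothetical stand-in for the underlying cluster handle.
type handle struct{ freed *int }

// guardedConn is a hypothetical connection wrapper, not rados.Conn.
type guardedConn struct{ cluster *handle }

// Shutdown releases the handle at most once: the nil check makes
// repeated calls harmless, and clearing the field records the release.
func (c *guardedConn) Shutdown() {
	if c.cluster == nil {
		return
	}
	*c.cluster.freed++ // placeholder for the real release call
	c.cluster = nil
}

func main() {
	freed := 0
	c := &guardedConn{cluster: &handle{freed: &freed}}
	c.Shutdown()
	c.Shutdown() // second call is a no-op
	fmt.Println(freed) // 1
}
```

A guard like this protects against double release, but as noted in the list it made no difference to the leak reported here.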

In the meantime I switched to a long-lived Ceph connection, and that seems to have fixed the issue for now.
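One way to structure a long-lived connection in Go is to create it lazily once and hand the same value to every caller. The sketch below assumes a hypothetical `dial` function and `sharedConn` type; in a real program `dial` would wrap whatever connection setup the application already performs, and error handling would need to be added.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedConn is a hypothetical stand-in for a cluster connection.
type sharedConn struct{ id int }

var (
	once   sync.Once
	shared *sharedConn
	dials  int // counts how many times dial ran, for illustration
)

// dial is a hypothetical placeholder for the real connection setup.
func dial() *sharedConn {
	dials++
	return &sharedConn{id: dials}
}

// getConn returns the process-wide connection, creating it on first use.
// sync.Once guarantees dial runs exactly once, even under concurrency.
func getConn() *sharedConn {
	once.Do(func() { shared = dial() })
	return shared
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getConn()
		}()
	}
	wg.Wait()
	fmt.Println(dials) // 1
}
```

This avoids creating thousands of short-lived connections, which is the condition under which the leak showed up, but it works around the leak rather than fixing it.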
