Comments (18)

palazzem commented on August 22, 2024

Thanks for reporting that, @webandtech. Can you enable debug mode for a few minutes and post some logs here, looking for this specific line? If the Sidekiq integration has an issue, it may fail to close the trace correctly, which would cause (incorrectly) high memory usage.

Thank you!

from dd-trace-rb.

webandtech commented on August 22, 2024

Unfortunately I was not able to reproduce this in an ad hoc run with debug enabled. It could have been related to the specific job running but I hesitate to re-enable the tracer on that process in production.

palazzem commented on August 22, 2024

@webandtech thanks for trying that! I will investigate our Sidekiq integration further. In the meantime, can you send us more details about which APM integrations you are using other than Sidekiq? (Feel free to open a ticket in our support channel!)

colby-swandale commented on August 22, 2024

We're having this same issue in a rake task. I will try to run it in debug mode to get the information you need, but so far I've been unable to reproduce the issue in a non-production environment.

E, [2017-11-15T08:07:49.706444 #4979] ERROR -- ddtrace: [ddtrace] (/path/to/bundle/shared/bundle/ruby/2.3.0/gems/ddtrace-0.9.0/lib/ddtrace/transport.rb:137:in `handle_response') Client error: Request Entity Too Large

palazzem commented on August 22, 2024

We're currently working on a patch so that each flush contains a reasonably sized payload (and doesn't wait for the whole trace to finish). I think this kind of problem mostly happens with long-running tasks. With #247 we should get:

  • lower overall memory usage, since we don't keep too much data in memory
  • shorter individual traces
  • some spans visible before the whole trace is completed.
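
To illustrate the idea, here is a minimal sketch of a partial-flush buffer. It is not the code from #247: the class name `PartialFlushBuffer`, its methods, and the `max_spans_before_flush` parameter are all hypothetical, and the "flush" just collects batches in memory instead of sending them to the agent.

```ruby
# Hypothetical illustration only; this is NOT the ddtrace implementation.
class PartialFlushBuffer
  attr_reader :flushed_batches

  def initialize(max_spans_before_flush: 1000)
    @max_spans_before_flush = max_spans_before_flush
    @finished = []         # finished spans waiting to be sent
    @flushed_batches = []  # stand-in for payloads sent to the agent
  end

  # Record a finished span and flush as soon as the buffer is full,
  # instead of waiting for the whole trace to complete.
  def finish_span(span)
    @finished << span
    flush if @finished.size >= @max_spans_before_flush
  end

  # Send whatever is currently buffered (also called when the trace ends).
  def flush
    return if @finished.empty?

    @flushed_batches << @finished
    @finished = []
  end

  def buffered_count
    @finished.size
  end
end

buffer = PartialFlushBuffer.new(max_spans_before_flush: 3)
7.times { |i| buffer.finish_span("span-#{i}") }
buffer.flush # end of the trace: send the remainder
```

With a threshold of 3, the seven spans leave in three small batches rather than one large payload, so at most three finished spans are held in memory at any time.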

I'll let you know when it has been merged + how you can try this opt-in functionality.

webandtech commented on August 22, 2024

That would make sense, as the processes where we were seeing this were running longer Sidekiq jobs.

palazzem commented on August 22, 2024

Nice, let us test that approach (it actually requires an opt-in); your feedback will then be very valuable to us. Thanks again, @webandtech, especially for raising the problem!

palazzem commented on August 22, 2024

Hello @webandtech! Sorry for the very late response. The branch I mentioned before (#247) has been tested successfully by us and other users. Would you like to give us some feedback so that we can be sure it also solves your problem?

After your feedback, I think we can add it in a new release. To use it you have to:

  • update your Datadog Agent to a version greater than 5.19.0
  • update your gem using our custom branch
  • add a configuration in your code

# Gemfile
gem 'ddtrace', :git => 'https://github.com/DataDog/dd-trace-rb.git', :ref => 'e63e190bd42207048368fe12790c78bc4e526695'

# in your Rails initializer, you should activate this functionality
Rails.configuration.datadog_trace = {...}

# NOTE: you can play with these values to set how many spans you want to keep in memory; defaults to 100k
Datadog.tracer.configure(
  priority_sampling: true,
  min_spans_before_partial_flush: 10, 
  max_spans_before_partial_flush: 1000
)

Thanks in advance!

palazzem commented on August 22, 2024

@webandtech hello! Just following up to check on the status of this issue. It would be great to get some feedback so we can improve or merge the proposed PR.

Thank you very much for helping us improve the Tracer!

webandtech commented on August 22, 2024

I haven't been able to reproduce this outside of production. I'll try to get this ref up on production shortly so we can test it there. Thanks!

palazzem commented on August 22, 2024

Nice! @webandtech, keep us updated so we can improve this PR if needed. Thanks!

webandtech commented on August 22, 2024

@palazzem With ref e63e190bd42207048368fe12790c78bc4e526695 in production and the settings you suggested, things look good to me. I will keep watching it over the next day, but memory on the problematic job queue looks to be within normal ranges after a few hours. Thanks!

yahooguntu commented on August 22, 2024

Any news on this? I've been running an older version of #247 for a few months now to keep the tracer from eating up all the RAM during a long-running job.

delner commented on August 22, 2024

@yahooguntu it sounds like that PR has been helpful in addressing the issue. We're planning to integrate those changes back into trunk in the near future.

buddhistpirate commented on August 22, 2024

Consider this a vote for this issue. We are having the same problem on some heavy-hitting Sidekiq jobs. We're trying to get Datadog APM into production and have run into a few issues; this is the current blocker. We're currently running v0.11.2. I'm not sure I'm comfortable going to production with this PR until it is merged into a release.

palazzem commented on August 22, 2024

Hello @buddhistpirate! This PR is actually scheduled to be merged for the next release. We'll probably update it a bit first, just to be sure the behavior is completely isolated. I'll keep you updated once the next release is out!

buddhistpirate commented on August 22, 2024

@palazzem Awesome, thanks for the info.

delner commented on August 22, 2024

Hey @buddhistpirate @yahooguntu @webandtech, we just merged #247, which should address this issue by implementing partial flushing. The feature is disabled by default and is currently considered experimental.

To activate this, you need to set the following in your configuration:

Datadog.configure do |c|
  c.tracer partial_flush: true
end

Or if you wish to configure the behavior, you can set the following flags:

Datadog.configure do |c|
  c.tracer min_spans_before_partial_flush: 10, # Default 10
           max_spans_before_partial_flush: 100, # Default 100
           partial_flush_timeout: 60 # In seconds, default 60
end

Note there is now a hard cap on the number of spans in a trace, set to 100,000.
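
As a rough picture of what a per-trace span cap means, here is a small sketch. How the tracer actually enforces the cap is not described in this thread, so dropping extra spans and counting them is an assumption, and `CappedTrace`, `add_span`, and `dropped` are hypothetical names that are not part of the ddtrace API.

```ruby
# Hypothetical illustration only; this is NOT the ddtrace implementation.
class CappedTrace
  DEFAULT_MAX_SPANS = 100_000 # the cap mentioned above

  attr_reader :spans, :dropped

  def initialize(max_spans: DEFAULT_MAX_SPANS)
    @max_spans = max_spans
    @spans = []
    @dropped = 0 # spans rejected because the trace was already full
  end

  # Returns true if the span was stored, false if the cap was reached.
  # Assumption: extra spans are simply dropped and counted.
  def add_span(span)
    if @spans.size >= @max_spans
      @dropped += 1
      return false
    end

    @spans << span
    true
  end
end

trace = CappedTrace.new(max_spans: 5)
results = 8.times.map { |i| trace.add_span("span-#{i}") }
```

With the cap lowered to 5 for the example, the first five spans are kept and the remaining three are counted as dropped, so memory for a single trace stays bounded no matter how long the job runs.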

Also if you were previously pinned to #247 for this feature (which was based on an older version of the tracer), and you want to use the newly merged version instead, you may have to upgrade your Datadog configuration to be compatible with the new API. See our migration guide for details.

Let me know if this is working for you guys. Going to close this one for now.
