Comments (2)

udoprog commented on May 27, 2024

In Heroic, the /write endpoint blocks until the data has been committed to both Elasticsearch* and Bigtable.

These could be potential explanations:

  1. The Elasticsearch* index operation might take a bit of time. A write-through cache can be enabled so that only the first write every n minutes pays this cost; to benefit from it, you should route writes to the same node as much as possible.
  2. Some layer in OpenTSDB/HBaseClient performs batching, and you are not seeing the round trip to Bigtable - only the time it takes to queue up the write.
  3. Heroic issues individual mutations for every datapoint in parallel, and we are hitting some resource constraint/issue regarding that.

*: assuming you have it configured to use Elasticsearch.

I don't think option 2 is very likely; ~30-40 ms is around the time I'd expect (assuming the API node lives in the same region). The query path supports tracing; we could add something similar to the write path as well to get better insight.

In general, we are aiming for sustained throughput over individual request latency, so you should expect to have a couple of in-flight writes pending at all times. I'd recommend using HTTP/2 to multiplex these over a single connection, or using a queue (like Kafka) as a buffer.
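The queue-as-buffer idea can be sketched as follows. This is a minimal illustration, not Heroic's actual API: the `WriteBuffer` class and its methods are hypothetical, and the "commit" is a stand-in for the blocking write path described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical write buffer: producers enqueue datapoints and return
// immediately, while a background thread drains the queue and performs
// the (blocking) writes. This keeps several writes in flight without
// making every caller wait for a full commit round trip.
public class WriteBuffer {
    private final BlockingQueue<String> queue;
    private final List<String> committed = new ArrayList<>();
    private final Thread drainer;
    private volatile boolean running = true;

    public WriteBuffer(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
        drainer = new Thread(() -> {
            while (running || !queue.isEmpty()) {
                try {
                    String point = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (point != null) {
                        // Stand-in for the blocking commit to Bigtable/Elasticsearch.
                        synchronized (committed) {
                            committed.add(point);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        drainer.start();
    }

    // Non-blocking: returns false when the buffer is full, giving the
    // producer a backpressure signal instead of an unbounded queue.
    public boolean offer(String datapoint) {
        return queue.offer(datapoint);
    }

    public void close() throws InterruptedException {
        running = false;
        drainer.join();
    }

    public int committedCount() {
        synchronized (committed) {
            return committed.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WriteBuffer buf = new WriteBuffer(1000);
        for (int i = 0; i < 500; i++) {
            buf.offer("metric-" + i);
        }
        buf.close(); // waits for the drainer to finish flushing
        System.out.println(buf.committedCount()); // prints 500
    }
}
```

The bounded queue matters: dropping or rejecting writes when the buffer is full is exactly the looser-consistency trade-off, since anything still queued when a node dies is lost even though the caller saw a fast ack.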

@juruen is also working on incorporating a batch writer provided by the client library, which in our experiments has improved throughput quite a bit. In its current implementation it would probably not help much with the latency of individual writes: batches are flushed every two seconds by default, so your latency would depend on how well you hit that window. But we might implement a looser consistency guarantee on top of this that would queue up writes instead of committing them. That would run the risk of losing some acked writes if the node goes down, but would provide better latency.

from heroic.

juruen commented on May 27, 2024

Actually, bulk mutations are already in master as of bc82c27.

Batches are automatically flushed when they reach 25 metrics (Bigtable client 0.9.3) or 100 metrics (Bigtable client 0.9.2); otherwise they are flushed after 2 seconds.
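That size-or-time flush policy can be sketched like this. The `Batcher` class and its method names are illustrative, not the Bigtable client's actual implementation; a real client would flush an aged batch from a timer thread, whereas here the age check happens on each add so the example stays single-threaded and deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative size-or-time batcher: a batch is flushed when it reaches
// maxSize entries, or when an add arrives more than maxAgeMillis after
// the batch was started, whichever comes first.
public class Batcher {
    private final int maxSize;
    private final long maxAgeMillis;
    private final List<String> batch = new ArrayList<>();
    private long batchStart;
    private int flushes;
    private int flushedItems;

    public Batcher(int maxSize, long maxAgeMillis) {
        this.maxSize = maxSize;
        this.maxAgeMillis = maxAgeMillis;
    }

    public void add(String mutation, long nowMillis) {
        if (batch.isEmpty()) {
            batchStart = nowMillis;
        }
        batch.add(mutation);
        if (batch.size() >= maxSize || nowMillis - batchStart >= maxAgeMillis) {
            flush();
        }
    }

    private void flush() {
        // Stand-in for sending the accumulated bulk mutation to Bigtable.
        flushes++;
        flushedItems += batch.size();
        batch.clear();
    }

    public int getFlushes() { return flushes; }
    public int getFlushedItems() { return flushedItems; }

    public static void main(String[] args) {
        Batcher b = new Batcher(25, 2000);
        // 60 mutations arriving at once: the size threshold fires twice
        // (2 x 25), leaving 10 mutations pending in the current batch.
        for (int i = 0; i < 60; i++) {
            b.add("m" + i, 0);
        }
        System.out.println(b.getFlushes() + " flushes, " + b.getFlushedItems() + " mutations");
        // An add arriving past the 2-second window forces the pending batch out.
        b.add("late", 2500);
        System.out.println(b.getFlushes() + " flushes, " + b.getFlushedItems() + " mutations");
        // prints:
        // 2 flushes, 50 mutations
        // 3 flushes, 61 mutations
    }
}
```

This also shows why per-write latency depends on how well you hit the window: a mutation that lands just after a flush can sit in the batch for up to the full timeout before it is committed.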

Depending on how you generate your metrics in your test and which BT client library version is in use, this may have an impact on latency.

