Comments (2)

udoprog commented on May 27, 2024

In Heroic, the /write endpoint blocks until the data has been committed to both Elasticsearch* and Bigtable.

These could be potential explanations:

  1. The Elasticsearch* index operation might take a bit of time. A write-through cache can be enabled so that only the first write every n minutes pays this cost; to benefit from it, you should route writes to the same node as much as possible.
  2. Some layer in OpenTSDB/HBaseClient performs batching, and you are not seeing the round trip to Bigtable - only the time it takes to queue up the write.
  3. Heroic issues individual mutations for every datapoint in parallel, and we are hitting some resource constraint/issue regarding that.

*: assuming you have it configured to use Elasticsearch.

I don't think option 2 is very likely; ~30-40 ms is around the time I'd expect (assuming the API node lives in the same region). The query path supports tracing; we could add something similar to the write path as well to get better insight.

In general, we are aiming for sustained throughput over individual request latency, so you should expect to have a couple of in-flight writes pending at all times. I'd recommend using HTTP/2 to multiplex these over a single connection, or using a queue (like Kafka) as a buffer.
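The queue-as-buffer idea can be sketched as follows. This is a minimal illustration, not Heroic's actual API: the `WriteBuffer` class and its methods are hypothetical, and the "commit" is a stand-in for the blocking write path described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical write buffer: producers enqueue datapoints and return
// immediately, while a background thread drains the queue and performs
// the (blocking) writes. This keeps several writes in flight without
// making every caller wait for a full commit round trip.
public class WriteBuffer {
    private final BlockingQueue<String> queue;
    private final List<String> committed = new ArrayList<>();
    private final Thread drainer;
    private volatile boolean running = true;

    public WriteBuffer(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
        drainer = new Thread(() -> {
            while (running || !queue.isEmpty()) {
                try {
                    String point = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (point != null) {
                        // Stand-in for the blocking commit to Bigtable/Elasticsearch.
                        synchronized (committed) {
                            committed.add(point);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        drainer.start();
    }

    // Non-blocking: returns false when the buffer is full, giving the
    // producer a backpressure signal instead of an unbounded queue.
    public boolean offer(String datapoint) {
        return queue.offer(datapoint);
    }

    public void close() throws InterruptedException {
        running = false;
        drainer.join();
    }

    public int committedCount() {
        synchronized (committed) {
            return committed.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WriteBuffer buf = new WriteBuffer(1000);
        for (int i = 0; i < 500; i++) {
            buf.offer("metric-" + i);
        }
        buf.close(); // waits for the drainer to finish flushing
        System.out.println(buf.committedCount()); // prints 500
    }
}
```

The bounded queue matters: dropping or rejecting writes when the buffer is full is exactly the looser-consistency trade-off, since anything still queued when a node dies is lost even though the caller saw a fast ack.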

@juruen is also working on incorporating a batch writer provided by the client library, which in our experiments has improved throughput quite a bit. In its current implementation it would probably not help much with the latency of individual writes: batches are flushed every two seconds by default, so your latency would depend on how well you hit that window. But we might implement a looser consistency guarantee on top of this that would queue up writes instead of committing them. That would run the risk of losing some acked writes if the node goes down, but would provide better latency.

from heroic.

juruen commented on May 27, 2024

Actually, bulk mutations are already in master as of bc82c27.

Batches are automatically flushed when they reach 25 metrics (Bigtable client 0.9.3) or 100 metrics (Bigtable client 0.9.2); otherwise they are flushed after 2 seconds.
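That size-or-time flush policy can be sketched like this. The `Batcher` class and its method names are illustrative, not the Bigtable client's actual implementation; a real client would flush an aged batch from a timer thread, whereas here the age check happens on each add so the example stays single-threaded and deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative size-or-time batcher: a batch is flushed when it reaches
// maxSize entries, or when an add arrives more than maxAgeMillis after
// the batch was started, whichever comes first.
public class Batcher {
    private final int maxSize;
    private final long maxAgeMillis;
    private final List<String> batch = new ArrayList<>();
    private long batchStart;
    private int flushes;
    private int flushedItems;

    public Batcher(int maxSize, long maxAgeMillis) {
        this.maxSize = maxSize;
        this.maxAgeMillis = maxAgeMillis;
    }

    public void add(String mutation, long nowMillis) {
        if (batch.isEmpty()) {
            batchStart = nowMillis;
        }
        batch.add(mutation);
        if (batch.size() >= maxSize || nowMillis - batchStart >= maxAgeMillis) {
            flush();
        }
    }

    private void flush() {
        // Stand-in for sending the accumulated bulk mutation to Bigtable.
        flushes++;
        flushedItems += batch.size();
        batch.clear();
    }

    public int getFlushes() { return flushes; }
    public int getFlushedItems() { return flushedItems; }

    public static void main(String[] args) {
        Batcher b = new Batcher(25, 2000);
        // 60 mutations arriving at once: the size threshold fires twice
        // (2 x 25), leaving 10 mutations pending in the current batch.
        for (int i = 0; i < 60; i++) {
            b.add("m" + i, 0);
        }
        System.out.println(b.getFlushes() + " flushes, " + b.getFlushedItems() + " mutations");
        // An add arriving past the 2-second window forces the pending batch out.
        b.add("late", 2500);
        System.out.println(b.getFlushes() + " flushes, " + b.getFlushedItems() + " mutations");
        // prints:
        // 2 flushes, 50 mutations
        // 3 flushes, 61 mutations
    }
}
```

This also shows why per-write latency depends on how well you hit the window: a mutation that lands just after a flush can sit in the batch for up to the full timeout before it is committed.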

Depending on how you generate your metrics in your test and which BT client library version is in use, this may have an impact on latency.

