
Comments (8)

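qinzuoyan commented on August 18, 2024

I have been thinking about this too. There are a few issues:
(1) Raising the bufhandle (i.e. block) size: how large is appropriate? There should be a threshold beyond which a larger block no longer reduces latency. That threshold will differ with machine configuration and bandwidth, so we need a method for determining it.
(2) The benefit of small blocks is that small requests do not waste memory. When sending, a block is owned exclusively by a single request, so oversized blocks are wasteful. With many small requests in the system the effect is more pronounced and memory consumption becomes severe.
(3) A compromise: let the blocks in a buffer grow as they are allocated, for example 4K for the first, 8K for the second, 16K for the third, and so on, until the threshold from (1) is reached, after which the size stops increasing.

I have long wanted to implement all of the above. If you have a better suggestion, we can discuss it further.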
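
The growth policy in (3) can be sketched as follows. This is a minimal illustration, not code from sofa-pbrpc: `NextBlockSize`, `BlockSizes`, and their parameters are hypothetical names, and the 32 KiB cap used in the usage note is only a placeholder for the threshold that (1) says still has to be determined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: size of the next block, given the previous block's size.
// first_size and max_size are illustrative parameters, not sofa-pbrpc names.
inline std::size_t NextBlockSize(std::size_t prev_size,
                                 std::size_t first_size,
                                 std::size_t max_size) {
    assert(first_size > 0 && first_size <= max_size);
    if (prev_size == 0) return first_size;        // first allocation in this buffer
    if (prev_size >= max_size) return max_size;   // already at the cap
    // Doubling would reach or pass the cap (written to avoid overflow).
    if (prev_size >= max_size - prev_size) return max_size;
    return prev_size * 2;
}

// Sizes of the first `count` blocks a single buffer would allocate.
inline std::vector<std::size_t> BlockSizes(std::size_t count,
                                           std::size_t first_size,
                                           std::size_t max_size) {
    std::vector<std::size_t> sizes;
    sizes.reserve(count);
    std::size_t prev = 0;
    for (std::size_t i = 0; i < count; ++i) {
        prev = NextBlockSize(prev, first_size, max_size);
        sizes.push_back(prev);
    }
    return sizes;
}
```

With a 4 KiB first block and a placeholder cap of 32 KiB, `BlockSizes(5, 4096, 32768)` yields 4096, 8192, 16384, 32768, 32768. A small request only ever pays for the first block, while a large one reaches the cap after a few allocations.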

from sofa-pbrpc.

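qinzuoyan commented on August 18, 2024

The optimization is done; see 0dbfbbd8acfe521736cd3

Main changes:

  • Added a block_size field to the block's hidden header. When a WriteBuffer grows, the size of each newly allocated tranbuf increases exponentially (1k, 2k, 4k, ...) up to a maximum of 32k, as a trade-off between "lower latency for large transfers" and "avoiding wasted memory". A ReadBuffer can share blocks across multiple buffers and so wastes no memory, which is why rpc_message_stream always uses the maximum (32K) when receiving data.
  • Set tcp_no_delay to avoid the extra delay on the last packet caused by Nagle's algorithm. See http://www.cnblogs.com/polymorphism/archive/2012/12/10/High_Latency_for_Small_Size_Entities_in_Table_Service.html . This lowers QPS slightly, so the SOFA_PBRPC_TCP_NO_DELAY macro controls whether the feature is enabled (on by default).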
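
For readers unfamiliar with the socket option behind the second bullet, here is a minimal sketch using the plain POSIX socket API. It is not taken from sofa-pbrpc: the helper names and the `EXAMPLE_TCP_NO_DELAY` macro are hypothetical stand-ins, and how the real SOFA_PBRPC_TCP_NO_DELAY macro is wired in may differ.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical compile-time switch, on by default. It only mimics the idea of
// a macro-controlled option; it is not the project's own macro.
#ifndef EXAMPLE_TCP_NO_DELAY
#define EXAMPLE_TCP_NO_DELAY 1
#endif

// Turns Nagle's algorithm off (enabled == true) or back on for a TCP socket.
// Returns true on success.
inline bool SetTcpNoDelay(int fd, bool enabled) {
    int flag = enabled ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) == 0;
}

// Reads the current TCP_NODELAY setting into *enabled. Returns true on success.
inline bool GetTcpNoDelay(int fd, bool* enabled) {
    assert(enabled != nullptr);
    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0) return false;
    *enabled = (flag != 0);
    return true;
}

// Applies the compile-time choice to a freshly created or accepted connection.
inline bool ConfigureNewConnection(int fd) {
#if EXAMPLE_TCP_NO_DELAY
    return SetTcpNoDelay(fd, true);
#else
    (void)fd;  // leave Nagle's algorithm enabled
    return true;
#endif
}
```

With the option set, a small trailing segment is sent immediately instead of waiting for outstanding data to be acknowledged, which is the tail latency the bullet describes. The cost is more small packets on the wire, consistent with the slight QPS drop noted above.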

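Results (obtained by running test/perf_test/test_delay.sh):

Data carried per request: 0.1KB | 1KB | 10KB | 100KB | 1MB | 10MB
Latency before optimization (microseconds): 5674799423153553439124989
Latency after optimization (microseconds): 5357186533378225234

  • For large payloads, latency drops markedly.
  • The marked improvement at 10KB comes from setting tcp_no_delay.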


CaesarTang commented on August 18, 2024

Why is the maximum tranbuf size 32k? What problems would a larger value cause?


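bluebore commented on August 18, 2024

First, 32k is large enough; going larger does not improve throughput. In our measurements, 10k already reaches 1.2 GB/s, which saturates a 10-gigabit NIC.
This optimization has been merged into trunk and released in v1.1.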


qinzuoyan commented on August 18, 2024

Released at: https://github.com/baidu/sofa-pbrpc/releases/tag/v1.1.0


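CaesarTang commented on August 18, 2024

Hi, in use I found the following: when the request message in the protocol has a repeated field with a fairly large size (several hundred entries), and the field's message type is itself a composite structure (with its own repeated fields), the serialized request is around 200k and BinaryRpcRequest::ProcessRequest takes a long time to run, tens of milliseconds or more. Is this caused by a performance bottleneck in Protocol Buffers deserialization? Is it within the expected range?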


qinzuoyan commented on August 18, 2024

I think so. You can benchmark protobuf serialization and deserialization on their own to verify your guess.
Even better, post your results here and we can consider some optimizations.
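
One way to run such a standalone measurement is a small timing helper like the sketch below. `AverageMicros` is a hypothetical name, and the message type in the usage comment (`MyRequest`) stands in for whatever generated protobuf class is being tested. Only `ParseFromString` and `SerializeToString` are real protobuf C++ methods.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Average wall-clock time of one call to `op`, in microseconds.
// `op` is any callable taking no arguments; `iterations` must be positive.
template <typename Op>
double AverageMicros(Op op, std::size_t iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Usage sketch (assumes a generated message class `MyRequest` and a
// std::string `bytes` holding one serialized ~200k request):
//
//   double parse_us = AverageMicros([&] {
//       MyRequest msg;
//       msg.ParseFromString(bytes);
//   }, 1000);
//
//   MyRequest filled;  // populated beforehand
//   double serialize_us = AverageMicros([&] {
//       std::string out;
//       filled.SerializeToString(&out);
//   }, 1000);
```

If the parse time measured this way accounts for most of the tens of milliseconds seen in BinaryRpcRequest::ProcessRequest, that would support the guess that protobuf deserialization is the bottleneck rather than the RPC layer.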


qhsong commented on August 18, 2024

In our testing we also found that repeated fields are very time-consuming to serialize.

