Comments (9)

holys commented on August 17, 2024

@qinix

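  1. Sorry, there is a small problem with how loader measures elapsed time: it was not the UPDATE tidb_loader.checkpoint statement that took over a hundred seconds, but the whole transaction. The transaction consists mainly of three SQL statements: use [db]; insert into xxxx (batch insert); UPDATE tidb_loader.checkpoint ... ; One thing is certain, though: tidb really was very slow at the time.

  2. What are the specs of your 6 servers? How many CPU cores, and how much memory?

  3. Were there any related error or warning logs from tidb, pd, or tikv at the time? If so, please attach them. Thanks.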

from docs-cn.

qinix commented on August 17, 2024

@holys

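6 machines, each with 4 cores and 8 GB of memory, on SSD disks.

The TiDB log shows a large number of "tikv reports ServerIsBusy, reason: scheduler is busy" messages, and most of the errors seem to be concentrated on a few fixed region_ids.

The tikv log looks like this: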

2017/07/11 11:09:15.875 apply.rs:619: [INFO] [region 2644] 2646 execute admin command cmd_type: CompactLog compact_log {compact_index: 3996 compact_term: 6} at [term: 6, index: 3998]
2017/07/11 11:09:17.585 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: batch_get, ts: 393148512555499521 [takes Duration { secs: 1, nanos: 203776507 }]
2017/07/11 11:09:17.795 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: resolve_lock, ts: 393137059706961921 [takes Duration { secs: 39428, nanos: 42244721 }]
2017/07/11 11:09:20.948 scheduler.rs:197: [WARN] [region 2752] scheduler handle command: batch_get, ts: 393148513328824321 [takes Duration { secs: 1, nanos: 852540755 }]
2017/07/11 11:09:21.646 scheduler.rs:197: [WARN] [region 2712] scheduler handle command: resolve_lock, ts: 393136936905605121 [takes Duration { secs: 39427, nanos: 429143668 }]
2017/07/11 11:09:25.243 scheduler.rs:197: [WARN] [region 2504] scheduler handle command: resolve_lock, ts: 393136040871198721 [takes Duration { secs: 21665, nanos: 901228119 }]
2017/07/11 11:09:28.935 scheduler.rs:197: [WARN] [region 2] scheduler handle command: resolve_lock, ts: 393137281244856321 [takes Duration { secs: 39230, nanos: 861153552 }]
2017/07/11 11:09:29.464 scheduler.rs:197: [WARN] [region 2672] scheduler handle command: resolve_lock, ts: 393136831182929921 [takes Duration { secs: 39229, nanos: 144934985 }]
2017/07/11 11:09:32.433 scheduler.rs:197: [WARN] [region 2504] scheduler handle command: resolve_lock, ts: 393136040871198721 [takes Duration { secs: 21651, nanos: 798761757 }]
2017/07/11 11:09:36.035 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: resolve_lock, ts: 393137059706961921 [takes Duration { secs: 39434, nanos: 438453427 }]
2017/07/11 11:09:39.950 scheduler.rs:197: [WARN] [region 2] scheduler handle command: resolve_lock, ts: 393137281244856321 [takes Duration { secs: 39220, nanos: 928453533 }]
2017/07/11 11:09:40.972 scheduler.rs:197: [WARN] [region 2752] scheduler handle command: batch_get, ts: 393148518597918721 [takes Duration { secs: 1, nanos: 710168763 }]
2017/07/11 11:09:43.066 apply.rs:619: [INFO] [region 2380] 2382 execute admin command cmd_type: CompactLog compact_log {compact_index: 7054 compact_term: 7} at [term: 7, index: 7057]
2017/07/11 11:09:43.690 scheduler.rs:197: [WARN] [region 2504] scheduler handle command: resolve_lock, ts: 393136040871198721 [takes Duration { secs: 21641, nanos: 619822940 }]
2017/07/11 11:09:44.018 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: batch_get, ts: 393148519476101121 [takes Duration { secs: 1, nanos: 252688147 }]
2017/07/11 11:09:45.328 apply.rs:619: [INFO] [region 1956] 1958 execute admin command cmd_type: CompactLog compact_log {compact_index: 18683 compact_term: 7} at [term: 7, index: 18685]
2017/07/11 11:09:45.873 apply.rs:619: [INFO] [region 2272] 2274 execute admin command cmd_type: CompactLog compact_log {compact_index: 10150 compact_term: 6} at [term: 6, index: 10152]
2017/07/11 11:09:45.876 apply.rs:619: [INFO] [region 2296] 2298 execute admin command cmd_type: CompactLog compact_log {compact_index: 11078 compact_term: 8} at [term: 8, index: 11080]
2017/07/11 11:09:47.052 scheduler.rs:197: [WARN] [region 2752] scheduler handle command: batch_get, ts: 393148520183889921 [takes Duration { secs: 1, nanos: 692004783 }]
2017/07/11 11:09:47.574 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: resolve_lock, ts: 393137059706961921 [takes Duration { secs: 39445, nanos: 753905515 }]
2017/07/11 11:09:51.175 scheduler.rs:197: [WARN] [region 2712] scheduler handle command: resolve_lock, ts: 393136936905605121 [takes Duration { secs: 39448, nanos: 260540361 }]
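
To check whether the slow commands really cluster on a few regions, the scheduler warnings can be tallied per region and command. The following is a minimal sketch, assuming the log lines keep exactly the format shown above; the function name summarize and the sample list are illustrative and not part of any TiKV tooling.

```python
import re
from collections import Counter

# Matches TiKV scheduler warnings in the format shown in the log excerpt above.
SCHED_RE = re.compile(
    r"\[WARN\] \[region (?P<region>\d+)\] scheduler handle command: "
    r"(?P<cmd>\w+), ts: \d+ "
    r"\[takes Duration \{ secs: (?P<secs>\d+), nanos: (?P<nanos>\d+) \}\]"
)


def summarize(lines):
    """Return (counts, worst) keyed by (region_id, command).

    counts: number of slow-command warnings per key.
    worst:  longest reported duration in seconds per key.
    """
    counts = Counter()
    worst = {}
    for line in lines:
        m = SCHED_RE.search(line)
        if m is None:
            continue  # not a scheduler warning (e.g. apply.rs INFO lines)
        key = (int(m["region"]), m["cmd"])
        seconds = int(m["secs"]) + int(m["nanos"]) / 1e9
        counts[key] += 1
        worst[key] = max(worst.get(key, 0.0), seconds)
    return counts, worst


# A few lines copied from the excerpt above.
sample = [
    "2017/07/11 11:09:15.875 apply.rs:619: [INFO] [region 2644] 2646 execute admin command cmd_type: CompactLog compact_log {compact_index: 3996 compact_term: 6} at [term: 6, index: 3998]",
    "2017/07/11 11:09:17.585 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: batch_get, ts: 393148512555499521 [takes Duration { secs: 1, nanos: 203776507 }]",
    "2017/07/11 11:09:17.795 scheduler.rs:197: [WARN] [region 2152] scheduler handle command: resolve_lock, ts: 393137059706961921 [takes Duration { secs: 39428, nanos: 42244721 }]",
    "2017/07/11 11:09:28.935 scheduler.rs:197: [WARN] [region 2] scheduler handle command: resolve_lock, ts: 393137281244856321 [takes Duration { secs: 39230, nanos: 861153552 }]",
]

counts, worst = summarize(sample)
for (region, cmd), n in counts.most_common():
    print(region, cmd, n, f"{worst[(region, cmd)]:.1f}s")
```

Feeding the full tikv log through summarize (for example by passing an open file object) would show which regions account for most of the resolve_lock warnings.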

In addition, loader reported a large number of errors:

2017/07/11 11:10:33 db.go:147: [error] exec sqls[[***]] commit failed Error 1105: [try again later]: backoffer.maxSleep 15000ms is exceeded, errors:
server is busy, ctx: region_id:2 region_epoch:<conf_ver:3 version:67 > peer:<id:6 store_id:4 >
server is busy, ctx: region_id:2 region_epoch:<conf_ver:3 version:67 > peer:<id:6 store_id:4 >
server is busy, ctx: region_id:2 region_epoch:<conf_ver:3 version:67 > peer:<id:6 store_id:4 >

2017/07/11 11:10:33 db.go:67: [warning] exec sql retry 4 - [***]

The loader process has also exited after multiple retries failed.
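
The "backoffer.maxSleep 15000ms is exceeded" error above suggests that retries stop once a cumulative sleep budget is used up. The following is a minimal sketch of that general idea only. It is not TiDB's actual backoffer: the name retry_with_budget, the exception class, and the base_ms and cap_ms values are assumptions made for illustration, and only the 15000 ms budget is taken from the error message.

```python
import time


class BackoffExceeded(Exception):
    """Raised when the cumulative sleep budget would be exceeded."""


def retry_with_budget(op, max_sleep_ms=15000, base_ms=100, cap_ms=2000,
                      sleep=time.sleep):
    """Call op() until it succeeds or the total sleep would exceed max_sleep_ms.

    base_ms and cap_ms are hypothetical values chosen for this sketch.
    """
    total_ms = 0
    attempt = 0
    errors = []
    while True:
        try:
            return op()
        except Exception as exc:  # e.g. a "server is busy" error
            errors.append(exc)
        # Exponential backoff, capped per attempt.
        delay_ms = min(cap_ms, base_ms * 2 ** attempt)
        if total_ms + delay_ms > max_sleep_ms:
            raise BackoffExceeded(
                f"maxSleep {max_sleep_ms}ms is exceeded after {len(errors)} errors"
            )
        sleep(delay_ms / 1000)
        total_ms += delay_ms
        attempt += 1
```

With a store that stays busy for longer than the whole budget, every attempt fails and the caller sees the final error, which is consistent with loader retrying the statement and eventually exiting.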

holys commented on August 17, 2024

@zhangjinpeng1987 PTAL

qinix commented on August 17, 2024

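I also noticed that the load across the three TiKV servers is uneven: two have almost no load while the third is under heavy pressure, and the flood of warning logs appears only on that heavily loaded server.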

holys commented on August 17, 2024

@qinix Could you provide the relevant grafana screenshots?

qinix commented on August 17, 2024

@holys

[Screenshot: 2017-07-11 11 49 16]
[Screenshot: 2017-07-11 12 03 05]

There are quite a lot of metrics; let me know which specific panels you need screenshots of.

darren commented on August 17, 2024

@qinix

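How is your tikv configured? Are you using the default configuration?

In our test environment we used the default tikv configuration and started importing data; after more than a month only about 1 TiB had been imported. That speed is completely unacceptable. The logs were also flooded with ServerIsBusy.

Later we changed the tikv configuration, setting sync-log under raftstore to false, and re-imported the data. The speed finally became somewhat more acceptable: so far about 6 TiB has been imported in roughly two weeks, although all of our disks are spinning hard drives.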
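
To put the two import runs on the same scale, the figures above can be converted to average throughput. This is a rough sketch that assumes "more than a month" means about 30 days and "two weeks" means 14 days; both durations are approximations, so the results are only indicative.

```python
TIB = 1024 ** 4          # bytes per TiB
MIB = 1024 ** 2          # bytes per MiB
DAY = 24 * 3600          # seconds per day


def avg_mib_per_s(tib, days):
    """Average import throughput in MiB/s for `tib` TiB over `days` days."""
    return tib * TIB / (days * DAY) / MIB


# Assumed durations: ~30 days with default settings, ~14 days with sync-log = false.
before = avg_mib_per_s(1, 30)
after = avg_mib_per_s(6, 14)
print(f"default config:   {before:.2f} MiB/s")
print(f"sync-log = false: {after:.2f} MiB/s")
print(f"speed-up:         {after / before:.1f}x")
```

Under these assumptions the default configuration averaged about 0.4 MiB/s and the modified one about 5.2 MiB/s, roughly a 13x difference. Because the first run took more than a month, its real average was somewhat lower.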

iroi44 commented on August 17, 2024

@darren Thank you for your interest in and use of TiDB. We take your problem seriously; you can email [email protected] to get in touch with us.

jiangnanora commented on August 17, 2024

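We ran into the same problem. We have 6 servers in total; the tikv nodes all use 3.2 PCIe cards, with 256 memory and 40 lcpu. While a load is running on tidb, writes from other applications basically cannot proceed and all pile up. In my stress tests the performance was very high in theory, so I don't know why this happens. In the end it also failed with an error: 2018/07/03 07:27:56 loader.go:124: [fatal] Error 9003: TiKV server is busy[try again later]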
