
shulietech / takin


Takin is a Java-based, open-source system for full-link performance testing in online (production) environments, especially for microservices. With Takin, middleware and applications can distinguish real online traffic from test traffic and ensure that each reaches the right database.

License: Apache License 2.0

performance-testing performance-analysis takin

takin's Introduction

Takin


What is Takin?

Takin is a Java-based, open-source system for full-link performance testing in online or test environments, especially for microservices. With Takin, middleware and applications can distinguish real online traffic from test traffic and ensure that each reaches the right database.
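
To make the idea of separating test traffic from real traffic concrete, here is a minimal Java sketch. It is not Takin's implementation: the header name X-Pressure-Test, the shadow_ table prefix, and the class TrafficRouter are all hypothetical, chosen only to illustrate routing by a per-request flag.

```java
import java.util.Map;

// Hypothetical sketch: route a request to a real or a shadow table based on a flag.
final class TrafficRouter {
    // Assumed header that marks a request as test traffic (not a documented Takin name).
    static final String TEST_HEADER = "X-Pressure-Test";
    // Assumed prefix for shadow tables.
    static final String SHADOW_PREFIX = "shadow_";

    // A request counts as test traffic only if the header is present and equals "true".
    static boolean isTestTraffic(Map<String, String> headers) {
        return "true".equalsIgnoreCase(headers.get(TEST_HEADER));
    }

    // Real traffic keeps the original table; test traffic is redirected to the shadow table.
    static String resolveTable(String table, Map<String, String> headers) {
        return isTestTraffic(headers) ? SHADOW_PREFIX + table : table;
    }

    public static void main(String[] args) {
        System.out.println(resolveTable("orders", Map.of()));                    // orders
        System.out.println(resolveTable("orders", Map.of(TEST_HEADER, "true"))); // shadow_orders
    }
}
```

The point of the sketch is that the decision is made once per request from a marker carried with the traffic, so business code does not need to know whether it is serving a test.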

Why should we do performance testing in the online environment?

Microservices architecture is widely used today, and it tends to make systems hard for people to understand. In large systems the business logic is also very complex. Together, business complexity and system complexity make it difficult to:

  • Keep the entire system highly available
  • Maintain research and development efficiency

To keep a system highly available, we usually run performance tests in a test environment or against a single service online. However, a test environment differs greatly from the online environment, and a single service cannot stand in for the whole service link, so neither approach can guarantee system performance.

Microservices Are Complex
Compared with a monolithic application, a microservices architecture adds complexity to a business system, which may have to maintain multiple tools and frameworks.

Business Systems Are Complex
Businesses span different domains, and many involve long, complicated processes, as in e-commerce.

The Microservices Relation Is Complex
In a microservices system with many business services, the call relationships between services are very complicated. Every change may affect the availability of the entire system, which makes it difficult for developers to release new versions frequently.

Quick Start Instruction

Docker:

  • VM memory requirement: 8 GB
  • Docker image size: 2.1 GB

If Docker is not already configured to use the Aliyun registry mirror, edit the daemon configuration:

vim /etc/docker/daemon.json

Add following configuration:

{
  "registry-mirrors": ["https://q2gr04ke.mirror.aliyuncs.com"]
}

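Reload the configuration and restart the Docker service so that the mirror setting takes effect:

systemctl daemon-reload
systemctl restart docker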

Pull the image and start the container:

# docker url : registry.cn-hangzhou.aliyuncs.com/shulie-takin/takin:v1.0.0
docker pull registry.cn-hangzhou.aliyuncs.com/shulie-takin/takin:v1.0.1
docker run -e APPIP=YOUR_IP_ADDRESS -p 80:80 -p 2181:2181 -p 29900-29999:29900-29999 registry.cn-hangzhou.aliyuncs.com/shulie-takin/takin:v1.0.1
  • Parameters: -d starts the container in the background; -p publishes a port.
    The container takes about 10 minutes to initialize because it installs the necessary components. With -d, the installation output of those components stays in the background. If you don't want to publish your server's ports, you can use --net=host instead, provided the container and the host server are on the same network.

  • Open http://APPIP/web

  • Note: if Nginx returns 502, the cause is most likely that the container has only just started. As long as the configuration is correct, wait 1-2 minutes, then refresh and try again.
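
The advice about the 502 response amounts to polling until the web console is ready. The following is a minimal Java sketch of that idea under stated assumptions: StartupWaiter and its status probe are hypothetical helpers, and the probe here is simulated rather than issuing a real HTTP request to http://APPIP/web.

```java
import java.util.function.IntSupplier;

// Hypothetical helper: poll a status probe until it reports HTTP 200.
final class StartupWaiter {
    // Returns the 1-based attempt on which 200 was seen, or -1 if it never was.
    static int waitUntilReady(IntSupplier statusProbe, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (statusProbe.getAsInt() == 200) {
                return attempt;
            }
            try {
                Thread.sleep(pauseMillis); // give the container time to finish starting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated probe: two 502 responses, then 200. A real probe would issue an HTTP GET.
        int[] responses = {502, 502, 200};
        int[] index = {0};
        int attempts = waitUntilReady(() -> responses[index[0]++], 5, 10);
        System.out.println("ready after " + attempts + " attempts"); // ready after 3 attempts
    }
}
```

In practice the pause would be tens of seconds, since startup is described as taking minutes.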

after installation:

Instruction

Takin Architecture


Takin consists of Agent, Web App and Surge Data.

Agent

Surge Data

Takin Web

Takin Engine

Community

Mailing List: Mail to [email protected]
WeChat group (QR code image)
QQ group: 118098566 (QR code image)
DingTalk group (QR code image)
WeChat Official Account (QR code image)

Ask Questions in Official Forum

Official Forum

Who uses Takin

License

Takin is under the Apache 2.0 license. See the LICENSE file for details.

takin's People

Contributors

angjulin, chzhh393, haikor, hezhongqi, huashen, iengrave, pirateskipper, shulietech, vinzhangya, wwclb1989, xhb7636553, yzhou452, zhang19970916, zhangya78, zhangz-2021, zhaofb


takin's Issues

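Compatibility roadmap for the SkyWalking agent

Our applications already run the SkyWalking agent, which costs about 10% in performance. The Shulie agent's overhead can reportedly be tuned below 5%, but running two agents side by side still raises performance and compatibility concerns, so a single agent would be ideal. There are several ways to get there: the Shulie agent could support the SkyWalking protocol, or the Shulie product could cover SkyWalking's core features. I would like to know the roadmap for this.

Discussion of code branching conventions

1. Should release branches be kept alive?
Release branch: create one branch per release, named after the version number.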

The compare method in the comparator fails when listener methods are sorted at the start of a stress-test task

```java
public void doEvents(Event event) {
    Map<String, ListenerContainer.Listener> map = listenerContainer.getListeners().get(event.getEventName());
    List<ListenerContainer.Listener> list = new ArrayList<>(map.values());
    Collections.sort(list, new Comparator<ListenerContainer.Listener>() {
        @Override
        public int compare(ListenerContainer.Listener o1, ListenerContainer.Listener o2) {
            // Reported bug: o1 is compared with itself, so the result is always -1.
            return o1.getIntrestFor().order() > o1.getIntrestFor().order() ? 1 : -1;
        }

        @Override
        public boolean equals(Object obj) {
            return false;
        }
    });
    for (ListenerContainer.Listener entry : list) {
        try {
            entry.getMethod().invoke(entry.getObject(), event);
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        } catch (InvocationTargetException e) {
            e.printStackTrace();
        }
    }
}
```
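
The comparator in the reported snippet compares o1 with itself and never returns 0, which breaks the Comparator contract and can make the sort fail at runtime. One possible correction is sketched below. OrderedListener is a hypothetical stand-in for ListenerContainer.Listener, whose real definition is not shown in the issue, and Comparator.comparingInt is one option rather than the project's actual fix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for ListenerContainer.Listener; only the ordering value matters here.
final class OrderedListener {
    final String name;
    final int order;

    OrderedListener(String name, int order) {
        this.name = name;
        this.order = order;
    }
}

final class ListenerSorting {
    // Compares the two arguments with each other and returns 0 when their orders are equal.
    static final Comparator<OrderedListener> BY_ORDER =
            Comparator.comparingInt((OrderedListener l) -> l.order);

    // Returns listener names sorted by ascending order value; the input list is not modified.
    static List<String> sortedNames(List<OrderedListener> listeners) {
        List<OrderedListener> copy = new ArrayList<>(listeners);
        copy.sort(BY_ORDER);
        List<String> names = new ArrayList<>();
        for (OrderedListener l : copy) {
            names.add(l.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<OrderedListener> listeners = List.of(
                new OrderedListener("c", 3),
                new OrderedListener("a", 1),
                new OrderedListener("b", 1));
        System.out.println(sortedNames(listeners)); // [a, b, c]
    }
}
```

Because List.sort is stable, listeners with the same order value keep their original relative position.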

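Inspection improvements

1. After a technical node is added, if several upstream nodes are linked to it, the many connecting lines between nodes make the dashboard display very cluttered.

2. Currently, when an inspection task covers several related business activities, a separate inspection scenario must be configured for each one, and the technical nodes they share have to be configured repeatedly, which is tedious.

Some suggestions for Takin

SkyWalking uses Slack and Gitter; Takin could do the same to encourage communication among developers and keep an archive of discussions.
Adopt a unified code style for code formatting.
The Shulie official website has relatively little content about the open-source project; a dedicated open-source site would help.
Provide a minimal edition of Takin for testers (stress-testing features only, usable for offline stress tests) alongside the full edition for production environments.
Make amdb data processing depend only on the database.
Provide agent (probe) design documentation, sample documentation for secondary development of plugins, and an outline of how to approach plugin development.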

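Who is using Takin? (Individuals and companies using Takin are welcome to leave a comment here)

Who is following and using Takin?

Thank you to the developers and users who follow and use Takin; your use of it is a great encouragement to us. We will keep investing so that the Takin project and community thrive and give developers and users a better experience.

Why does this issue exist?

  • To learn more about how Takin is really used, to inform future release planning
  • To attract more developers to contribute to the project

What we hope you will provide to the Takin community

Please post a comment here that includes:

  • Your company, school, or organization
  • Your city and country
  • Your contact details: email or WeChat (at least one)
  • The business scenarios in which you use Takin

You can follow this example:

Company: Shulie Technology
Location: Hangzhou
Contact: [email protected]
Use case: the company's production stress-testing platform, providing accurate system capacity estimates for company campaigns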

Error when importing the project into IDEA

The following dependency is missing:

<dependency>
    <groupId>io.shulie.amdb</groupId>
    <artifactId>amdb-common</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>

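Some suggestions from a front-end perspective

The front-end dependencies are too old and will become harder and harder to maintain; upgrading is recommended

Some dependencies are still on versions from two or three years ago. Adding other dependencies may cause incompatibilities, and documentation may no longer be available when fixing bugs.

| Dependency | Version in use | Latest version | Notes |
| react | 16.8.6 | 17.0.2 | Core dependency |
| umi | 2.13.7 | 3.5.20 | Scaffolding |
| antd | 3.26.13 | 4.16.13 | UI component library |
| racc | 0.4.5 | 0.4.5 | Components built on antd@3 by a former colleague; no longer maintained and should be considered for removal |

The overall UI style is dated; a redesign is recommended

Repository management

The front end should preferably be maintained in its own repository, with the core repository's documentation listing the sub-project repository addresses. GitHub Actions could automatically merge the built front-end files into the back end, so that users can run the project right after pulling it and only visit the front-end repository when they need the source.

Establish conventions, improve the documentation, and automate wherever possible

For example, semantic versioning is the most common versioning convention.
Automate packaging, changelog generation, and so on.

Upgrading and refactoring the front end would be a very large amount of work.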

Dependency org.apache.zookeeper:zookeeper, leading to CVE problem

Hi. In Takin/takin-data/surge-data/common there is a dependency, org.apache.zookeeper:zookeeper:3.4.9, whose vulnerable method is called by this project.

CVE-2019-0201

The affected version range reported for this CVE is [11.0, 24.1.1-android),(24.1.1-android, 24.1.1-jre)

After further analysis, the main API called in this project is <org.apache.zookeeper.server.FinalRequestProcessor: void processRequest(org.apache.zookeeper.server.Request)>

Risk method repair link: GitHub

CVE Bug Invocation Path--

Path Length : 4

<org.apache.zookeeper.server.FinalRequestProcessor: void processRequest(org.apache.zookeeper.server.Request)>
at <org.apache.zookeeper.server.quorum.CommitProcessor: void run()> (org.apache.zookeeper.server.quorum.CommitProcessor.java:[77]) in /.m2/repository/org/apache/zookeeper/zookeeper/3.4.9/zookeeper-3.4.9.jar
at <io.shulie.surge.data.common.zk.impl.CuratorZkPathChildrenCache: void setNewData(java.util.List)> (io.shulie.surge.data.common.zk.impl.CuratorZkPathChildrenCache.java:[185, 187]) in /detect/unzip/Takin-1.0.1/takin-data/surge-data/common/target/classes
at <io.shulie.surge.data.common.zk.impl.CuratorZkPathChildrenCache: void access$1000(io.shulie.surge.data.common.zk.impl.CuratorZkPathChildrenCache,java.util.List)> (io.shulie.surge.data.common.zk.impl.CuratorZkPathChildrenCache.java:[49]) in /detect/unzip/Takin-1.0.1/takin-data/surge-data/common/target/classes

Dependency tree--

[INFO] io.shulie.surge.data:common:jar:1.0
[INFO] +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO] |  +- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO] |  \- org.slf4j:slf4j-api:jar:1.7.25:compile
[INFO] +- com.github.stephenc.high-scale-lib:high-scale-lib:jar:1.1.4:compile
[INFO] +- com.github.sgroschupf:zkclient:jar:0.1:compile
[INFO] +- commons-collections:commons-collections:jar:3.2.2:compile
[INFO] +- org.apache.zookeeper:zookeeper:jar:3.4.9:compile
[INFO] |  +- log4j:log4j:jar:1.2.16:compile
[INFO] |  +- jline:jline:jar:0.9.94:compile
[INFO] |  \- io.netty:netty:jar:3.10.5.Final:compile
[INFO] +- commons-codec:commons-codec:jar:1.6:compile
[INFO] +- com.alibaba:fastjson:jar:1.2.72:compile
[INFO] +- com.netflix.curator:curator-framework:jar:1.3.3:compile
[INFO] |  \- com.netflix.curator:curator-client:jar:1.3.3:compile
[INFO] +- com.netflix.curator:curator-recipes:jar:1.3.3:compile
[INFO] +- org.apache.commons:commons-lang3:jar:3.11:compile
[INFO] +- commons-io:commons-io:jar:1.3.2:compile
[INFO] \- com.google.guava:guava:jar:15.0:compile

Suggested solutions:

Update dependency version

Thank you very much.

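Suggested additions to the Docker installation section

In the current Docker installation section, the "enter the container" step is only a comment with no accompanying command, so readers unfamiliar with Docker may mistake the first command for the one that enters the container.

Suggestions:
1. Add the --name takin option to the docker run command to give the container a name.
2. Add the command for entering the container to the documentation below, for example docker exec -it takin sh.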

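Password redaction

When redacting passwords, it is best to rewrite history with git filter-branch and push to the remote with git push --force. Otherwise the real passwords remain in the commit history.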

874799a

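Support tenant isolation

1. Tenant isolation should be complete, covering application configuration, Redis, ZooKeeper, and so on.

Statistical reports and analysis

1. Stress-test statistics and analysis should be based on the results of multiple runs; a single run should not be taken as the result.
2. Statistics for page UV and PV metrics.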

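Do not block stress-test traffic

Make integration debugging simpler by adding a debugging mode that lets stress-test traffic travel as far down the chain as possible. Missing configuration should not be signalled by throwing exceptions, so that debugging traffic is not blocked: at present any exception blocks the traffic immediately. This also places demands on users, for example when the database account has no permission to create tables, or when they need to know whether an interface can be added to the whitelist. Such information should instead be reported in a way similar to trace data.

Databases do not raise errors directly; the SQL statements that were executed are reported.
Whitelist checks also pass directly; the whitelist entries that were hit are reported.
MQ supports automatic creation of shadow topics through an API.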
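
The request above, reporting missing configuration instead of throwing, can be sketched roughly as follows. This is an illustration only: DebugReport, LenientDebugFlow, and the sample SQL and URL are hypothetical and do not correspond to Takin classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collector: records configuration gaps instead of throwing exceptions.
final class DebugReport {
    private final List<String> findings = new ArrayList<>();

    void record(String kind, String detail) {
        findings.add(kind + ": " + detail);
    }

    List<String> findings() {
        return Collections.unmodifiableList(findings);
    }
}

final class LenientDebugFlow {
    // Runs both checks for one debug request; a failed check is recorded and the flow continues.
    static DebugReport run(boolean shadowTableExists, boolean urlWhitelisted) {
        DebugReport report = new DebugReport();
        if (!shadowTableExists) {
            report.record("database", "no shadow table for SQL: INSERT INTO orders (id) VALUES (1)");
        }
        if (!urlWhitelisted) {
            report.record("whitelist", "interface not whitelisted: /order/create");
        }
        return report; // reaching this point means neither gap blocked the request
    }

    public static void main(String[] args) {
        for (String finding : run(false, false).findings()) {
            System.out.println(finding);
        }
    }
}
```

With this shape, one debug run surfaces every gap at once rather than stopping at the first exception.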

Duplicated code: the repeated assignment is unnecessary

```java
private void notifyTaskResult(ScheduleRunRequest request) {
    SceneTaskNotifyParam notify = new SceneTaskNotifyParam();
    notify.setSceneId(request.getRequest().getSceneId());
    notify.setTaskId(request.getRequest().getTaskId());
    notify.setCustomerId(request.getRequest().getCustomerId());
    // Reported duplicate: the same value is assigned a second time.
    notify.setCustomerId(request.getRequest().getCustomerId());
    notify.setStatus("started");
    sceneTaskService.taskResultNotify(notify);
}
```

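Stress-test debugging is too weak and very hard to use

1. There is no detailed technical documentation to guide integration during stress-test onboarding and debugging.

2. The product is still hard to use, for example:

  • How to handle errors reported when attaching the agent

  • How to prepare baseline data when debugging a stress test

  • How to tell when a middleware is not supported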

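Support mixed-scenario stress testing

  1. Different scenarios need different load settings, such as concurrency and target TPS, which requires support for multiple thread groups.
  2. Stay compatible with users' existing JMeter scripts to reduce the work of re-editing scripts.
  3. Reports should support statistics broken down by thread group and by logic controller.
  4. Detect whether user files have changed and handle the change accordingly.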

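Integration with development pipelines

In day-to-day environments, Takin is in most cases integrated with DevOps as part of the development pipeline, but there is currently no dedicated API or tutorial to help users integrate it quickly into an existing pipeline.

RabbitMQ isolation issue

The fanout exchange type is not yet fully supported and the configuration is complicated. The implementation should be changed to obtain consumer information through the RabbitMQ admin API.

Increase unit test coverage

None of the projects currently has a test module, so there is no testing foundation for verifying that a code change is correct or whether it affects other modules.
For external contributors, internal testing collaboration, and code merges, code submitted under the conventions must come with reasonably complete unit tests.
Unit test coverage needs to be increased gradually in later releases.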

Error when running the docker command

Running the following docker command as described in the instructions reports errors:
docker run -d -p 80:80 -p 2181:2181 -p 3306:3306 -p 6379:6379 -p 8086:8086 -p 9000:9000 -p 10032:10032 -p 6628:6628 -p 8000:8000 -p 6627:6627 -p 8888:8888 -p 29900-29999:29900-29999 registry.cn-hangzhou.aliyuncs.com/forcecop/forcecop:v1.0.0

Relevant part of the log:
Archive: /data/apps.zip

replace apps/tro-web/trodb_web_base.sql? [y]es, [n]o, [A]ll, [N]one, [r]ename: NULL

(EOF or read error, treating as "[N]one" ...)

apps_install.sh: line 4: /usr/local/nginx/conf/nginx.conf: No such file or directory

pid not found

apps_install.sh: line 80: nginx: command not found

/usr/bin/tail: inotify cannot be used, reverting to polling: Function not implemented

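Use total TPS as the stress-test target

The stress-test target is currently the TPS of the test traffic alone, but users usually only know what total TPS is safe for the system. The suggestion is therefore to make the target the application's total TPS and to derive the stress-test TPS as the configured target TPS minus the current business TPS. This keeps the system's total TPS within the expected range whenever the test runs and avoids introducing extra risk.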
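
The proposed rule is a single subtraction, sketched below in Java. StressTargetCalculator is a hypothetical name, and clamping the result at zero when business traffic already exceeds the target is an added assumption, not part of the original suggestion.

```java
// Hypothetical helper for the proposal: stress TPS = target total TPS - current business TPS.
final class StressTargetCalculator {
    // Clamped at zero (assumption): no test load is added if business traffic already meets the target.
    static double stressTps(double targetTotalTps, double currentBusinessTps) {
        return Math.max(0.0, targetTotalTps - currentBusinessTps);
    }

    public static void main(String[] args) {
        // Target of 1000 TPS in total while the business currently serves 350 TPS.
        System.out.println(stressTps(1000, 350)); // 650.0
    }
}
```

Since business TPS changes over time, the value would need to be recomputed periodically during a run for the total to stay within the target.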

Travis CI

It is recommended to add Travis CI to this project, in line with other open-source projects. Thanks.

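Some suggestions for Takin

1. The architecture should support SaaS, with tenants, user organizations, and row-level data permissions (propagated downstream to amdb and other components).
2. Define conventions and standards for data security in the big-data and agent components (agent collection and reporting, big-data storage and analysis, identification of sensitive data, and so on).
3. Big-data computation and analysis could cover more industry scenarios (for example, deriving more valuable data for a specific industry background, supported by the technical architecture, separated by tenant, and running different analysis tasks by configuration), with license-based loading and an open plugin marketplace to attract more commercial plugin developers.
4. A data reconciliation tool for quickly verifying data, because Takin's data generation and processing pipeline is long (this would improve troubleshooting during delivery).
5. Get the WeChat Official Account running, so that users under a tenant can bind their accounts and receive alerts and reminders through it.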

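Simpler debugging and module loading optimization

Make integration debugging simpler by adding a debugging mode that lets stress-test traffic travel as far down the chain as possible. Missing configuration should not be signalled by throwing exceptions, so that debugging traffic is not blocked: at present any exception blocks the traffic immediately. This also places demands on users, for example when the database account has no permission to create tables, or when they need to know whether an interface can be added to the whitelist. Such information should instead be reported in a way similar to trace data.

Databases do not raise errors directly; the SQL statements that were executed are reported.
Whitelist checks also pass directly; the whitelist entries that were hit are reported.
MQ supports automatic creation of shadow topics through an API.

Bug fix, security module:
Some modules should be loaded synchronously. Agent modules are currently loaded asynchronously, which leaves a time gap between application startup and successful module loading. If stress-test traffic is started during that gap from somewhere other than the console, data isolation may not have finished loading, which can cause data problems.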
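
One way to close the gap described above is to hold back test traffic until the isolation module signals that it has loaded. The Java sketch below illustrates this with a CountDownLatch. IsolationGate is a hypothetical class, not part of the Takin agent, and how the agent actually loads modules is not described in the issue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical gate: test traffic is admitted only after the isolation module has loaded.
final class IsolationGate {
    private final CountDownLatch loaded = new CountDownLatch(1);

    // Called by the (asynchronous) module loader once data isolation is active.
    void markLoaded() {
        loaded.countDown();
    }

    // Waits up to timeoutMillis for the module; returns false if it is still not loaded.
    boolean admitTestTraffic(long timeoutMillis) {
        try {
            return loaded.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        IsolationGate gate = new IsolationGate();
        System.out.println(gate.admitTestTraffic(10)); // false: module not loaded yet
        gate.markLoaded();
        System.out.println(gate.admitTestTraffic(10)); // true: safe to admit test traffic
    }
}
```

Rejecting or queueing test requests while the gate is closed trades a short delay at startup for the guarantee that no test data reaches real tables.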

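Show which database tables a link depends on, and whether each is read or written

Some companies separate reads and writes: some links only read and others only write (for example, back-office configuration). When configuring shadow tables, users do not want to configure tables that are only read. It would help to list which tables a link touches and which are read or written, hand the list to the DBA to create shadow tables for the written ones, and then simply tick the written tables on the page.

Error when deleting a shadow database/table

Deleting or disabling a shadow database/table returns the following error (the "solution" value means "please contact the administrator"): {"error":{"code":"0000_0000_0000","msg":"syntax error, expect {, actual EOF, pos 0","solution":"请联系管理员处理"},"data":null,"totalNum":null,"success":false}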

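Post-test performance analysis is hard to use and needs improvement

After a stress test ends, there are still few ways to analyze system performance bottlenecks:

  1. How can database bottlenecks be found?

  2. How should gateway performance problems be located?

  3. The logic behind the trace display is hard to follow.

  • The live view currently shows the latency of every call in the trace, which is unnecessary (users only care about the calls that take a long time during the test).
  • The latency breakdown per call is not clear enough (the response time does not match the latencies shown in the trace, and the latency UI does not highlight what matters).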
