
owl's Introduction

OWL


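OWL is an enterprise-grade distributed monitoring and alerting system open-sourced by TalkingData, a leading third-party data intelligence service provider in China. It is currently developed, updated, and maintained by the Tech Operation Team.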

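All OWL backend components are written in Go, a statically and strongly typed, compiled, concurrent, garbage-collected programming language developed by Google. Its concurrency model makes good use of multiple cores, and a binary compiled once for a platform runs on any machine of that platform, which keeps operational costs very low. See the official Go documentation for more information. The frontend is built with iView, a Vue.js-based UI component library also open-sourced by TalkingData and aimed mainly at admin and back-office products for desktop browsers.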

Features

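  • Written in Go; simple to deploy and maintain
  • Distributed, with support for multiple data centers
  • Multi-dimensional data model, similar to opentsdb
  • Multiple alerting algorithms, with support for combined conditions, time ranges, alert templates, and more
  • Flexible plugin mechanism: plugins can be written in any language, accept arguments, and are synced to clients automatically
  • Rich set of alert channels: email, WeCom (企业微信), SMS, phone calls, and custom scripts
  • Raw data is stored permanently and can be forwarded to opentsdb, kairosdb, and kafka
  • Built-in web management UI with powerful custom charting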
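
To make the "multi-dimensional data model" feature concrete, here is a minimal sketch of an opentsdb-style data point: a metric name plus free-form tags, where each distinct tag set identifies its own series. The `DataPoint` type and `SeriesKey` function are hypothetical names chosen for illustration; they are not taken from the OWL source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// DataPoint is a hypothetical sample in an opentsdb-style model:
// a metric name, a value, a timestamp, and free-form tags.
type DataPoint struct {
	Metric    string
	Value     float64
	Timestamp int64
	Tags      map[string]string
}

// SeriesKey builds a canonical identifier for the series a point belongs to.
// Tag keys are sorted so the same tag set always yields the same key.
func SeriesKey(p DataPoint) string {
	keys := make([]string, 0, len(p.Tags))
	for k := range p.Tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(p.Metric)
	for _, k := range keys {
		b.WriteString("," + k + "=" + p.Tags[k])
	}
	return b.String()
}

func main() {
	p := DataPoint{
		Metric:    "cpu.idle",
		Value:     93.5,
		Timestamp: 1700000000,
		Tags:      map[string]string{"host": "web01", "idc": "bj"},
	}
	fmt.Println(SeriesKey(p)) // prints "cpu.idle,host=web01,idc=bj"
}
```

Two points with the same metric but different tag values produce different keys, which is what lets a query slice one metric along any tag dimension.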

Architecture

owl

Demo

demo

Components

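agent: installed on every monitored host to collect monitoring data

netcollect: collects interface data from network devices over SNMP v2

repeater: receives monitoring data sent by agents and writes it to the backend storage

cfc: maintains the list of plugins each client should run, hostname and IP address updates, and the list of collected metrics

controller: loads alert strategies from the database, generates tasks and sends them to the inspector, and raises alerts based on the results

inspector: fetches monitoring tasks from the controller, evaluates them against the data in the TSDB, and returns the results to the controller

api: exposes the HTTP REST API; the web UI gets its data through it

MySQL: persistent storage for all configuration, including hosts, alert strategies, host groups, and users

TSDB: time series database, which stores the collected monitoring data

frontend: web management UI for convenient day-to-day system administration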

Frontend source code

https://github.com/TalkingData/owl-frontend

owl's People

Contributors

aidenpan0x, godliness, hrsjw1, leviathan1995, liuts, qweasgw123, wuyingsong


owl's Issues

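Custom service feature

The data that owl's k8sagent sends to owl looks like this:
{ "metric": "k8s.restart.count", "data_type": "counter", "value": 0, "tags": { "container": "fronted-prod-v1", "namespace": "web-ns", "port": "3306", "pod": "fronted-prod-v1-7d7fd69755-62dlh" } }

I would like to be able to define a service from any of these tags, for example namespace and container. All metrics in the TSDB that carry those tags would then be monitoring items of that service.
Alerting: alerts could be driven by a custom TSDB query, for example k8s.restart.count with the tags used to define the service, and an alert could be raised for each of the multiple results the query returns.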
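
The request above amounts to a tag selector: a service is a set of tag key/value pairs, and a series belongs to the service when it carries all of them. The following is a minimal sketch of that matching rule only. The `Service` type and its `Matches` method are hypothetical and do not exist in OWL.

```go
package main

import "fmt"

// Service is a hypothetical definition: a name plus the tags
// a series must carry to be counted as part of the service.
type Service struct {
	Name     string
	Selector map[string]string
}

// Matches reports whether tags contains every key/value pair in the selector.
// An empty selector matches every series.
func (s Service) Matches(tags map[string]string) bool {
	for k, want := range s.Selector {
		if got, ok := tags[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	svc := Service{
		Name: "frontend",
		Selector: map[string]string{
			"namespace": "web-ns",
			"container": "fronted-prod-v1",
		},
	}
	// Tags copied from the k8sagent sample in the issue.
	tags := map[string]string{
		"container": "fronted-prod-v1",
		"namespace": "web-ns",
		"port":      "3306",
		"pod":       "fronted-prod-v1-7d7fd69755-62dlh",
	}
	fmt.Println(svc.Matches(tags)) // prints "true"
}
```

Extra tags on the series, such as port and pod here, do not affect the match, so one selector can cover every pod of a container.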

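Question about worker_count in the inspector configuration file

Is worker_count in the inspector configuration file the interval at which the inspector evaluates alerts? After I changed it to 2, the inspector did not run the alert strategy calculation every 2 minutes; it still runs every 5 minutes.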

Product roadmap

To the maintainers: thanks for sharing.
Could you provide the product roadmap?
Thanks

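Switching alerts between "active", "acknowledged", and "recovered"

An alert is in the "active" state. I click "acknowledge" for 1 hour and start working on it. I finish within 5 minutes, but the alert cannot change to the "recovered" state. The usual workflow is: receive the alert on WeChat, click acknowledge so that it stops alerting, log in to the machine and fix the problem, then wait for the recovery message on WeChat.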
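
The behavior requested in this issue can be written as a small state transition function. This is a sketch of the desired transitions only, not OWL's actual implementation; the `State` type and `Next` function are hypothetical, and expiry of the acknowledgement window is deliberately left out.

```go
package main

import "fmt"

// State is a hypothetical alert state; the names mirror the three states in the issue.
type State string

const (
	Active       State = "active"
	Acknowledged State = "acknowledged"
	Recovered    State = "recovered"
)

// Next returns the state after one evaluation of the alert condition.
// firing is true while the monitored condition is still violated.
func Next(cur State, firing bool) State {
	switch {
	case !firing:
		// The condition cleared: recover even if an acknowledgement is still open.
		return Recovered
	case cur == Acknowledged:
		// Still firing but acknowledged: stay quiet.
		return Acknowledged
	default:
		return Active
	}
}

func main() {
	s := Acknowledged  // an active alert the operator has acknowledged
	s = Next(s, true)  // problem not fixed yet
	fmt.Println(s)     // prints "acknowledged"
	s = Next(s, false) // problem fixed after 5 minutes
	fmt.Println(s)     // prints "recovered"
}
```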

Time series queue support persistent storage

Currently, when the repeater receives time series data sent by the owl client, it stores the data in a Go channel. If the repeater process crashes at that point, all time series data held in the channel is lost. We therefore want to use a persistent queue to ensure that data is not lost.
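
One simple way to approach this is an append-only log on disk: each point is written and synced before it is handed on, and the log is replayed after a restart. The sketch below shows that idea under stated assumptions. The `Point` type, the `Append` and `Replay` functions, and the JSON-lines file format are all hypothetical and are not part of the repeater; a real queue would also need to truncate or rotate the log once points have been delivered.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Point is a hypothetical, simplified time series sample.
type Point struct {
	Metric string  `json:"metric"`
	Value  float64 `json:"value"`
}

// Append writes one point as a JSON line and syncs the file before returning,
// so an acknowledged point survives a process crash.
func Append(path string, p Point) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	line, err := json.Marshal(p)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		return err
	}
	return f.Sync()
}

// Replay reads back every point in the log, e.g. after a restart.
// A missing log file is treated as an empty queue.
func Replay(path string) ([]Point, error) {
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var out []Point
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var p Point
		if err := json.Unmarshal(sc.Bytes(), &p); err != nil {
			return nil, err
		}
		out = append(out, p)
	}
	return out, sc.Err()
}

// run appends two points to a log in a temporary directory and replays them.
func run() ([]Point, error) {
	dir, err := os.MkdirTemp("", "owl-queue")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "queue.log")
	for _, p := range []Point{{"cpu.idle", 93.5}, {"mem.used", 41}} {
		if err := Append(path, p); err != nil {
			return nil, err
		}
	}
	return Replay(path)
}

func main() {
	pts, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(pts), pts[0].Metric, pts[1].Metric)
}
```

Syncing on every write trades throughput for durability; batching several points per sync is a common compromise when the ingest rate is high.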

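Plugin count displayed incorrectly

Problem description and steps to reproduce:
1. Remove all plugins from every host in the ALL-HOST host group, and from the host group itself
2. Add one plugin to the ALL-HOST host group; the host group now shows 1 plugin
3. Check the hosts: their plugin count still shows 0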

image
