hitsz-ids / duetector

duetector🔍: Data Usage Extensible Detector for data usage observability.

Home Page: https://dataucon.idslab.io/

License: Apache License 2.0

Dockerfile 0.34% Python 99.30% Shell 0.36%

bcc data-usage ebpf kata-containers observability

duetector's People

wunder957 wyxsb suhastj30 zhemulin aklly jszama ninzeige guo-yunzhe

duetector's Issues

Deploy readthedocs

Currently it is difficult to access class documentation, and it is best to deploy readthedocs for developers to access the documentation.

I'm currently focused on feature development and may not have time to prioritize this ahead of time. Feel free to make a PR.

Cookiecutter for extensions

Cookiecutter is a Python package, easily installable with pip or other package managers, that enables you to create and use templates for microservices and software projects.

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

Reference examples/extension to build cookiecutter templates.

🏕Solution(optional)

A set of cookiecutter templates that lets users choose which plugin to create
Provide a way to test it
(Nice to have) Test it using GitHub Actions.

Add a README to examples/extension to tell people that there are cookiecutter templates available.

🍰Detail(optional)

(Cookiecutter-template) Please fork and draft PR on this project: https://github.com/hitsz-ids/duetector-cookiecutter.

🍰Example(optional)

https://github.com/Wh1isper/pypi-hatch-pytest-cookiecutter

🚅Search before asking(optional)

I have searched for issues similar to this one.

🚅Description(optional)

Prior to the official release, we should benchmark performance (and other characteristics) to clarify the non-functional requirements for the next phase.

🏕Solution(optional)

These are some of the things that may need to be done:

Design a workload so as to represent common usage scenarios
Design environments for benchmarks, e.g. high load environments
Testing, Collecting and Plotting
Analyze the data

🍰Detail(optional)

We have @WYXsb to help us on this. Also looking forward to any relevant proposal and help.

Thanks!

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

Support this draft: #84

Report: https://github.com/hitsz-ids/duetector/blob/main/docs/draft/case-mnist/report.md

🏕Solution(optional)

Add tracepoint or kprobe mentioned in the report

🍰Detail(optional)

You need to be able to read Chinese, as the draft (report) is written in Chinese.

🍰Example(optional)

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

OpenTelemetry is an Observability framework

The tentative plan is to introduce it around v0.2.0.

🏕Solution(optional)

I think we can use it to replace the current tracking implementation and to implement a generic collector.

🍰Detail(optional)

🍰Example(optional)

A Production case of using duetector

We plan to demonstrate the capabilities of this project by having a production-level case run-through by the time of the 0.1.0 release. There is currently a draft, see tracking-mljob-in-kata-containers.

For now, we have the simplest case, open count. Based on the experience of writing this case, I think we still have the following to accomplish:

Analyze which points of a data analysis task are traceable (kprobe, tracepoint, etc.); this will produce a report or flowchart
Implement the relevant tracers based on the above report
Manually analyze trace data as a way to explore analyzer implementations

Any refinements to this draft or other use cases are encouraged!

PIP Server Implementation

Document translation

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

We currently provide users with documentation in English and are continuously updating it.

If people can help us with translations or even contribute to translations on an ongoing basis (I know it's a bit early to say), it would be beneficial to advance the promotion of this project!

Docs here: https://github.com/hitsz-ids/duetector/tree/main/docs

🏕Solution(optional)

Provide a translated version of each document with a _zh suffix
Add links to the English and Chinese versions in each document (see Example)

🍰Detail(optional)

Please note source/ is not included in the translation.

🍰Example(optional)

This is an example of a translated README.md document

<p align="center">
 <a href="./README.md">English</a> | <a href="./README_zh.md">中文</a>
</p>

Build PIP Server

Policy Information Point (PIP): Serves as the retrieval source of attributes, or the data required for policy evaluation to provide the information needed by the PDP to make the decisions. NIST SP 800-162.

In version 0.1.0 we will provide the base PIP service. It should not be limited to the existing tracer and collector implementations, so I think we should introduce an indirection layer to isolate the PIP service from the existing trace/collection mechanisms. This indirection layer may be what we call an analyzer.
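
As a rough illustration of that indirection layer, the sketch below has a PIP service depend only on an abstract analyzer interface, so the storage or collection mechanism behind it can change freely. All class, method, and field names here (`Tracking`, `Analyzer`, `InMemoryAnalyzer`, `PipService`, `query`, `attributes`) are hypothetical and chosen for illustration; they are not taken from the duetector codebase.

```python
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class Tracking:
    """Hypothetical record shape; real tracer output may differ."""

    tracer: str
    pid: int
    fname: str


class Analyzer(ABC):
    """Indirection layer: the PIP service depends only on this interface."""

    @abstractmethod
    def query(self, tracer: str) -> List[Tracking]:
        """Return the records produced by the named tracer."""


class InMemoryAnalyzer(Analyzer):
    """One possible backend; a database-backed analyzer could replace it."""

    def __init__(self, records: Iterable[Tracking]) -> None:
        self._records = list(records)

    def query(self, tracer: str) -> List[Tracking]:
        return [r for r in self._records if r.tracer == tracer]


class PipService:
    """Serves attributes for policy evaluation without knowing how they were collected."""

    def __init__(self, analyzer: Analyzer) -> None:
        self._analyzer = analyzer

    def attributes(self, tracer: str) -> List[Dict[str, Any]]:
        return [asdict(r) for r in self._analyzer.query(tracer)]


# Example wiring with two made-up records.
service = PipService(
    InMemoryAnalyzer(
        [
            Tracking("OpenTracer", 42, "/data/train.csv"),
            Tracking("TcpconnectTracer", 42, ""),
        ]
    )
)
```

Swapping `InMemoryAnalyzer` for another `Analyzer` subclass would leave `PipService` unchanged, which is the isolation this issue asks for.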

More tracers

Bring CO-RE into this project

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

As bcc is not a CO-RE eBPF framework, we need to rely on other frameworks such as libbpf, aya-rs, or cilium/ebpf, none of which are written in Python. In addition, there may be other, non-eBPF tracing programs that run as separate processes.

🏕Solution(optional)

I think we can introduce a set of mechanisms for sub-processes as a way to achieve integration with other detectors.

Note that we have already implemented a monitor for shell commands, ShMonitor, and a process daemon, Daemon.

🍰Detail(optional)

We still have the following to move forward:

Design a protocol for interacting with processes (basically over stdout)
Implement a buffered subprocess monitor according to the protocol, mostly SubprocessMonitor
Provide an integration method, mostly the SubprocessTracer class
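
To make the stdout-based idea more concrete, here is a minimal sketch of a buffered monitor that reads one JSON object per line from a child process. The line-delimited JSON format, the class name `LineBufferedSubprocessMonitor`, and the event fields are assumptions for illustration only; the actual protocol and the real SubprocessMonitor are defined by the design discussion, not by this sketch.

```python
import json
import subprocess
import sys
from collections import deque
from typing import Any, Deque, List


class LineBufferedSubprocessMonitor:
    """Hypothetical sketch: read JSON events, one per line, from a child's stdout."""

    def __init__(self, cmd: List[str], maxlen: int = 1024) -> None:
        self.cmd = cmd
        # Bounded buffer so a slow consumer cannot grow memory without limit.
        self.buffer: Deque[Any] = deque(maxlen=maxlen)

    def run(self) -> int:
        """Run the child to completion and buffer every well-formed event."""
        with subprocess.Popen(self.cmd, stdout=subprocess.PIPE, text=True) as proc:
            assert proc.stdout is not None
            for line in proc.stdout:
                line = line.strip()
                if not line:
                    continue
                try:
                    self.buffer.append(json.loads(line))
                except json.JSONDecodeError:
                    # Skip lines that are not protocol messages (e.g. stray logs).
                    continue
        return proc.returncode


# Stand-in for an external tracer: a child Python process printing two events
# and one line that is not part of the assumed protocol.
child = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'pid': 1, 'event': 'open'})); "
    "print('not json'); print(json.dumps({'pid': 2, 'event': 'read'}))",
]
monitor = LineBufferedSubprocessMonitor(child)
exit_code = monitor.run()
events = list(monitor.buffer)
```

A line-oriented format keeps the child side trivial to implement in any language (libbpf, aya-rs, cilium/ebpf), since it only has to print to stdout.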

Not sure this is beneficial or could benefit from #25.

🍰Example(optional)

See draft: #44

PIP Server Framework Initialization

Migrate BccTracer to CO-RE (SubprocessTracer)

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

Now that we have SubprocessMonitor in #44, we can migrate the existing BccTracer to CO-RE using the subprocess protocol.

🏕Solution(optional)

We're going to have multiple PRs for this.

#80
Migrate OpenTracer to CO-RE
Migrate TcpconnectTracer to CO-RE
Support CO-RE compile and build distribution

🍰Detail(optional)

See design: https://github.com/hitsz-ids/duetector/blob/main/docs/design/CO-RE.md

🍰Example(optional)

0.0.1

Main Features:

bcc monitor
memory collector
by-pass filter

Unittest:

Mock bcc monitor

PR (will be merged into main):

Add read and write tracers

🚅Search before asking

We need many eBPF tracers to track operations on data. There are currently no tracers for read and write operations on files, so we need read and write tracers.

🚅Description

Add new read and write BccTracers, modeled on the other tracers here.

🏕Solution

1. See our Developer Manual.
2. Look at the implementation of other tracers.
3. Fork our project, use BCC to write BccTracers under our framework, and test them locally.
4. Create a PR.

Update usercases with `analyzer`

Migrate `CloneTracer` to CO-RE

We currently use bcc as our BPF framework, which creates some shortcomings: https://github.com/hitsz-ids/duetector/blob/main/docs/design/CO-RE.md#12-status-quo

This issue will migrate the current CloneTracer to CO-RE form to validate our draft and provide a case for CO-RE!

How to address data users' concerns about deploying DataUCon

duetector version:
Python version:
Operating System:

Description

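Hello. As I understand it, this project mainly addresses the risk that, after a data provider hands data to a data user, the data user may misuse or leak it. The approach looks feasible. However, is this technique highly intrusive to the data user's environment, and how do you address the data user's security concerns about deploying the system?
On one hand, if the system is fully open source and deployed by the data user, how can you be sure the data user has not modified the code and changed its logic?
On the other hand, if the system is a black box, how can the data user trust it enough to deploy it? Because the system can monitor all traffic, it effectively holds very high privileges. How can the data user trust that it will not steal the data user's own data and become a Trojan horse for the data provider? In data trading and circulation scenarios, the demand is more often for mutual data sharing than for one-way data provision.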

What I Did

I looked through the documentation links provided here but could not find a more detailed explanation, so I am asking this way. Thank you.

Proposal: intro analyzer for trackings

[Good First Issue] Guide for New Contributors

To help more people get involved in the project faster, we created this issue. Thank you for clicking on this issue and thinking about how you can contribute to our project. 🎉

🍰 What is Good First Issue?

Good First Issues empower first-time contributors to open-source software.

We have three levels of difficulty for you:

🚅 How to pick a task?

Participation in an issue that has already been assigned is also encouraged, but be aware that it will require more effort to get up to speed.

👀 Browse the issues you are interested in
🙋‍ Comment below the issue you want to participate in
📧 After further communication with the maintainer, issue will be assigned to you
🔧 A Maintainer will be your mentor and will continue to work with you on issue resolution
🚅 Read our contributing guidelines and commit code to your fork.
✔ After PR, Code review, Request changes, Approve..., your code will be merged into the master branch and you will become one of our developers!

❓ Need help?

When you have some problems and want to ask for help:

Ping maintainer directly below the issue
Send us an email: [email protected]

⭐New ideas?

If you have any new ideas, including Feature Requests, feel free to raise an issue as a good first issue, and the maintainer will analyze the problem and evaluate the difficulty with you.

👨‍🔧Technical support

There are currently two maintainers responsible for maintaining this project:

Update develop and other docs

Better Startup Script

We currently provide a very early startup script, which I hope will gain the following functionality:

Pass parameters for duectl-daemon start
Pass parameters for duectl-server-daemon start
Provide a user-defined script entry point to replace the current user application (JupyterLab), most likely by exec-ing one script that can be defined by an environment variable
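
One way the environment-variable idea could look is sketched below. The variable names `DUE_USER_ENTRYPOINT` and `DUE_DAEMON_ARGS` and the default `jupyter lab` command are hypothetical placeholders, not names used by the existing startup script.

```python
import shlex
from typing import List, Mapping

# Assumed default: the JupyterLab application mentioned above.
DEFAULT_COMMAND = ["jupyter", "lab"]


def user_command(env: Mapping[str, str]) -> List[str]:
    """Return the command to exec, taken from a (hypothetical) environment variable."""
    raw = env.get("DUE_USER_ENTRYPOINT", "").strip()
    return shlex.split(raw) if raw else list(DEFAULT_COMMAND)


def daemon_args(env: Mapping[str, str], key: str) -> List[str]:
    """Split extra CLI parameters for a daemon out of an environment variable."""
    return shlex.split(env.get(key, ""))
```

In a real entry point, the resulting list could be handed to `os.execvp(cmd[0], cmd)` so the user script replaces the launcher process, and the daemon arguments could be appended to the `duectl-daemon start` and `duectl-server-daemon start` invocations.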

dependency issues when starting using duectl

Description

First, I installed it using the command from readme.md:

pip install duetector

This step completed normally.

When using the

sudo duectl start

command to start, it prompts:

ModuleNotFoundError: No module named 'bcc'

Reproduce

run pip install duetector
run sudo duectl start

Expected behavior

It shows :

  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/duetector/monitors/bcc_monitor.py", line 66, in init
    from bcc import BPF  # noqa
ModuleNotFoundError: No module named 'bcc'
(base) ➜  git_DIR

Context

Operating System and version: macOS 12.5 (21G72)
Browser and version(if necessary): -
Which version are you using: python 3.9.12

Error message

(base) ➜  git_DIR sudo duectl start
Password:
2023-11-03 10:38:37.271 | INFO     | duetector.config:generate_config:93 - Creating default config file /Users/user_name/.config/duetector/config.toml
2023-11-03 10:38:37.272 | INFO     | duetector.config:load_config:114 - Loading config from /Users/user_name/.config/duetector/config.toml
2023-11-03 10:38:37.273 | INFO     | duetector.config:load_env_config:145 - Loading config from environment variables, prefix: `DUETECTOR_`, sep: `__`
2023-11-03 10:38:37.274 | INFO     | duetector.config:dump_config:170 - Current config has been dumped to /private/tmp/duetector_config.toml.31588
2023-11-03 10:38:37.413 | INFO     | duetector.managers.collector:init:63 - Collector DequeCollector is disabled
Traceback (most recent call last):
  File "/Users/user_name/opt/anaconda3/bin/duectl", line 8, in <module>
    sys.exit(cli())
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/duetector/cli/main.py", line 159, in start
    monitors.append(BccMonitor(c))
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/duetector/monitors/bcc_monitor.py", line 58, in __init__
    self.init()
  File "/Users/user_name/opt/anaconda3/lib/python3.9/site-packages/duetector/monitors/bcc_monitor.py", line 66, in init
    from bcc import BPF  # noqa
ModuleNotFoundError: No module named 'bcc'
(base) ➜  git_DIR
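
The traceback shows that `bcc` is imported lazily inside `BccMonitor.init`, so the failure only appears at start time. bcc builds on eBPF, which is a Linux kernel feature, so the import cannot succeed on macOS. A small guard like the following sketch could turn the raw `ModuleNotFoundError` into a clearer message. The helper name `require_module` and the hint wording are hypothetical and not part of duetector.

```python
import importlib
from types import ModuleType


def require_module(name: str, hint: str) -> ModuleType:
    """Import a module by name, turning a missing dependency into an actionable error."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(f"Cannot import '{name}': {hint}") from exc
```

A caller could then write something like `require_module("bcc", "bcc requires Linux and is not installed by pip install duetector")` before constructing the monitor.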

Add an application user to the Dockerfile, and make it usable

Testing tools for BccTracer

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

Currently we don't have a good way to test BccTracer (and its subclasses). The main difficulty is that there are no test suites or test methods with predictable results.

🏕Solution(optional)

I think we can simulate booting the kernel with a tool such as qemu and then perform a series of predictable operations to get predictable results.

🍰Detail(optional)

We already have experience related to compiling bcc images

🍰Example(optional)

TBD

Support filter or identifier for runc containers

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

Currently, since runc containers share the kernel with each other and with the host, information about the host is also collected from inside a runc container.

🏕Solution(optional)

We need a way to filter it to support cloud-native environments.

A duetector running inside a container should only report on the in-container process.

The duetector running on the host should be able to distinguish whether a process is coming from a container or not.

Reference:

https://github.com/cilium/cilium
https://github.com/deepflowio/deepflow

🍰Detail(optional)

Currently bcc does not provide a cgroup-related helper. One possible idea is to get the cgroup knid through task_struct:

task_struct->cgroups->subsys[CGROUP_SUBSYS_COUNT]->cgroup->kn->id.id

But at the same time we want to get more information for tracking and analysis. For example, a Docker container's cgroup:

0::/system.slice/docker-56f9992608a558ef5dbe28317de44f3459dd5968035e30508a3f1c160bb5744b.scope

56f9992608a558ef5dbe28317de44f3459dd5968035e30508a3f1c160bb5744b is the container's id

Once we have the process's cgroup information, we can clearly conclude whether the process is running in a runc container or not.

So we need to get more information from the user-space program. Or maybe we can get all the information we need from user space alone, without relying on eBPF.

One possible way is cat /proc/{pid}/cgroup. However, given the current trigger mechanism of the poller, we may not be able to get readable information for short-lived processes. ~~And once we've implemented #44, front-end programs can then run with lower latency.~~
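
The user-space route can be sketched as below: read `/proc/{pid}/cgroup` and pull the container id out of a Docker scope name like the one shown above. The function names are hypothetical, and the pattern only covers the systemd-style `docker-<id>.scope` layout from this example; other runtimes and cgroup drivers use different paths.

```python
import re
from typing import Optional

# Matches the scope name from the example above: "docker-<64 hex chars>.scope".
_DOCKER_SCOPE = re.compile(r"docker-([0-9a-f]{64})\.scope")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the Docker container id found in /proc/<pid>/cgroup content, if any."""
    for line in cgroup_text.splitlines():
        match = _DOCKER_SCOPE.search(line)
        if match:
            return match.group(1)
    return None


def read_container_id(pid: int) -> Optional[str]:
    """Read /proc/<pid>/cgroup; return None if the file cannot be read."""
    try:
        with open(f"/proc/{pid}/cgroup", encoding="utf-8") as f:
            return container_id_from_cgroup(f.read())
    except OSError:
        # Short-lived processes may be gone before we get here.
        return None


sample = (
    "0::/system.slice/"
    "docker-56f9992608a558ef5dbe28317de44f3459dd5968035e30508a3f1c160bb5744b.scope"
)
```

The `OSError` branch reflects the short-lived-process concern above: by the time the poller fires, the `/proc` entry may no longer exist, and the caller has to tolerate a missing answer.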

🍰Example(optional)

How to get cgroup path of task in an eBPF program?

[BOT] Add Contributors

@all-contributors
please add @wunder957 for code.
please add @WYXsb for code.

Support jaeger analyzer

Problem

Proposed Solution

Additional context

Version 0.0.2

In 0.0.2, we plan to bring the following functionality to achieve the goal of monitoring a particular machine learning task

We'll explore how the design ties together surveillance information. If you have any ideas, please feel free to share and discuss them with us!

Make analyzer pluggable and configurable in Server

Tracers: Support attaching multiple C functions to the BPF

In #28 we found that the current tracer API does not support attaching multiple C functions to the BPF, which makes it hard to track the life cycle of a connection or thread.

I think we could add attatch: List[Tuple[attatch_type, attatch_args]] to BccTracer as an advanced way to make it more flexible.
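
A minimal sketch of how such a list could be consumed is shown below. The class `MultiAttachTracer`, the helper `attach_all`, and the C function names are hypothetical; the field name is spelled as in the proposal. `FakeBPF` stands in for `bcc.BPF`, whose `attach_kprobe` and `attach_kretprobe` methods accept `event` and `fn_name` keyword arguments, so the sketch runs without bcc installed.

```python
from typing import Any, Callable, Dict, List, Tuple


class MultiAttachTracer:
    """Hypothetical tracer that declares several attach points instead of one."""

    # Each entry is (attach type, attach arguments), following the proposal above.
    attatch: List[Tuple[str, Dict[str, Any]]] = [
        ("kprobe", {"event": "tcp_v4_connect", "fn_name": "trace_connect_entry"}),
        ("kretprobe", {"event": "tcp_v4_connect", "fn_name": "trace_connect_return"}),
    ]


def attach_all(tracer: MultiAttachTracer, bpf: Any) -> List[str]:
    """Call bpf.attach_<type>(**args) for every declared attach point."""
    attached = []
    for attach_type, attach_args in tracer.attatch:
        attach: Callable[..., Any] = getattr(bpf, f"attach_{attach_type}")
        attach(**attach_args)
        attached.append(attach_type)
    return attached


class FakeBPF:
    """Stand-in for bcc.BPF that records calls instead of touching the kernel."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def attach_kprobe(self, **kwargs: Any) -> None:
        self.calls.append(("kprobe", kwargs))

    def attach_kretprobe(self, **kwargs: Any) -> None:
        self.calls.append(("kretprobe", kwargs))
```

Pairing an entry probe with a return probe on the same kernel function is the kind of life-cycle tracking that a single attach point cannot express.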

Version 0.1.1

New Features, will draft PR and bring important features:

Documents, will draft PR:

Survey, will draft proposal:

Support query injector attr by analyzer

In #103, we introduced Injector to provide more information about a process.

Now we need to support querying these attributes in the analyzer.

How to work with OpenTelemetry

🚅Search before asking

I have searched for issues similar to this one.

🚅Description

We currently support OpenTelemetry's Collector (#82), but there is no documentation on how to configure it yet.

🏕Solution(optional)

I would like someone to help us by testing it on different telemetry systems and providing the configuration documentation and test results.

otlp-grpc
otlp-http
jaeger-thrift (This is a compatibility option, as Jaeger already natively supports OTLP)
jaeger-grpc (This is a compatibility option, as Jaeger already natively supports OTLP)
zipkin-http
zipkin-json
prometheus

Please comment and let me know which telemetry system you would like to participate in so we can discuss this in more depth!

🍰Detail(optional)

🍰Example(optional)

EPIC: Towards Measurable Data Usage

Problems

This project was initially directed towards unplugged detection of data usage behaviour through eBPF technology. I'm glad we've implemented an initial framework for it. But to make the probe results available to other applications (e.g. the Data Usage Controller of the DataUcon project), we need to expose the results of our recording in a machine-readable format.

On the other hand, we need to finish standardising the storage side of things, and for large numbers of events, a traditional SQL database is not a good choice.

We don't yet have a good production example to represent our capabilities.

Status quo and Future

Relationship with storage back-end

OpenTelemetry is sought after by related projects as an open source standard for observability. Although our project is far from observability in terms of observables, goals, and functions, it is similar to OpenTelemetry-related projects in terms of technical implementation, and we should be able to benefit from the development of OpenTelemetry and related backends.

As the project has evolved, we have completed the integration with OpenTelemetry: #82. Next, we will make OpenTelemetry our primary support, and SQL databases MAY NOT be actively maintained.

We are currently using jaeger as the first backend to access the.

Cloud-native support

We will natively support monitoring of containers on the cloud, so let's start with the docker and k8s.

How to expose data

We will first build a querier for the jaeger backend to restore the tracer data from the backend, and then implement an analytics engine that can form an analysis of the tracer data to derive a picture of how the process is using the data. We will refer to this process as the measurement of data usage

Designing and implementing an analysis engine
Docs: 4W1H of Data Usage and Our Measurement Capabilities

production example

We previously accepted a machine learning case for MNIST that included analysis and associated probing points for data usage behaviours: #84, and I thought we could start with this case to demonstrate our data usage measurement capabilities

#85
Support 4W1H Measurement of MNIST Case
Docs: 4W1H Measurement in MNIST Case

Other maintenance

Instead of splitting the project into a querier and a detector (at least not in the near future), we'll build two different images based on the same Python package (duetector). We already have different CLI entry points, so I'm sure this won't be difficult.

In addition, we need to optimise the README document and the design document a bit, assuming the backend to be OpenTelemetry

Splitting container images
Docs: Switch to OTel backend

Roadmap

This EPIC will be released as version 1.0.0. Before that, the features described above will be integrated gradually in 0.x.y versions.

Regarding data use measurability, I am working on some related blogs (in Chinese).
