bugout-dev / humbug

Get usage metrics and crash reports for your API, library, or command line tool.

License: Apache License 2.0

Python 60.12% Shell 3.18% Go 13.48% TypeScript 7.16% Java 13.72% Clojure 1.73% Kotlin 0.60%

crash-reporting python analytics usage-reports usage golang go javascript crash-reports library

humbug's Introduction

humbug

Humbug helps you understand what keeps users coming back to your developer tool as well as any friction they experience.

Humbug lets you collect basic system information and crash reports while respecting your users' privacy. In addition to getting reports, you get to be GDPR-compliant from day one.

Humbug is currently available in the following programming languages:

Python
- System information report
- Error traceback report
- Packages available in the current Python process report
- Logs report
- Environment variables report
- Custom report with full content control
Go
- System information report
- Panic report
- Custom report with full content control
Javascript
- System information report
- Error traceback report

If you would like support for another programming language, please create an issue.

Using Humbug

Setup

Follow the instructions in the Getting started with usage and crash reporting guide.

From development to production

We recommend generating one token for development and testing and using different tokens for each version of your production library or application.

Accessing reports

You can access your Bugout knowledge base at https://bugout.dev, via the Bugout API, or using the bugout command line tool.

Bugout client libraries:

The bugout command line tool can be installed from: https://github.com/bugout-dev/bugout-go/releases/latest

You can use humbug.bash to download your Humbug reports to your filesystem in an easy-to-analyze JSON format.

Getting help

You can get help by:

humbug's People

Contributors

Stargazers

Watchers

Forkers

zomglings kompotkot sophiaar andrei-dolgolev yhtiyar benjyw

humbug's Issues

Ability to store reports on filesystem, retrieve them from filesystem, and publish them with consent

This enables the common desktop app crash reporting semantics of prompting the user once a crash has occurred to check if they would like to report the issue back to the developer.

`pkg_resources` warnings on install

<redacted>/lib/python3.10/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)

<redacted>/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

System report should include "ci:<provider>" tags when running in CI/CD environments

Some of our customers build software that's intended to run in CI/CD pipelines. It is important for them to know which CI/CD tools their users run that software in.

Most of our customers would also like to be able to filter on events fired from CI/CD or filter them out of their analyses.

I propose that, when we generate system tags, we try to detect whether we are running in a CI/CD environment and add the (inferred) name of the CI/CD provider (e.g. GitHub Actions, CircleCI, TravisCI, etc.) as a tag of the form ci:<provider>.

Each CI/CD platform has certain environment variables that it uses to configure internal processes:

Note that all of them seem to set CI=true. So at the very least we should use that. We can also use the presence or absence of other environment variables to infer which specific CI provider we are working with.
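The detection described above could be sketched as follows. This is only an illustration, not the library's implementation: the function name detect_ci_tag and the tag values are hypothetical, and the provider-specific variables (GITHUB_ACTIONS, CIRCLECI, TRAVIS, GITLAB_CI) are the ones those providers document setting in their build environments.

```python
import os
from typing import Mapping, Optional

# Assumed mapping from a provider-specific environment variable to a tag value.
# The tag values are hypothetical; only the variable names come from the providers.
CI_PROVIDER_VARIABLES = {
    "GITHUB_ACTIONS": "github-actions",
    "CIRCLECI": "circleci",
    "TRAVIS": "travis",
    "GITLAB_CI": "gitlab",
}


def detect_ci_tag(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a ci:<provider> tag, or None when no CI environment is detected."""
    env = os.environ if environ is None else environ
    # Prefer a provider-specific variable when one is set.
    for variable, provider in CI_PROVIDER_VARIABLES.items():
        if env.get(variable):
            return f"ci:{provider}"
    # Fall back to the generic CI variable when the provider is not recognized.
    if env.get("CI", "").strip().lower() in ("true", "1"):
        return "ci:unknown"
    return None
```

Passing the environment as an argument keeps the function easy to test without modifying os.environ.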

Add call reports to HumbugReporter

These should be function/method decorators. They should allow the caller to specify that certain kinds of information get reported when a function is called:

That the function was called (with args + kwargs?)
If there was an exception raised by the function
When the function returns (with return value?)
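A decorator along these lines might look like the sketch below. It is not the HumbugReporter API: record_calls and the publish callback are hypothetical names, and the event dictionaries stand in for whatever report format the reporter would actually use. Arguments and return values are deliberately left out, since the issue leaves open whether they should be reported.

```python
import functools
from typing import Any, Callable, Dict


def record_calls(publish: Callable[[Dict[str, Any]], None]) -> Callable:
    """Build a decorator that reports call, error, and return events.

    `publish` is a hypothetical callback standing in for the reporter.
    """

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            name = func.__qualname__
            publish({"event": "call", "function": name})
            try:
                result = func(*args, **kwargs)
            except Exception as err:
                # Report the error type, then re-raise so behavior is unchanged.
                publish({"event": "error", "function": name, "error": type(err).__name__})
                raise
            publish({"event": "return", "function": name})
            return result

        return wrapper

    return decorator
```

The wrapper re-raises the original exception, so decorating a function does not change what its callers observe.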

Add ability for humbug-python to use the batch endpoint.

Create library method that wraps `HumbugReporter.Publish` in a goroutine

The function should have a signature like this:

func (reporter *HumbugReporter) PublishAsync(report Report) (func(), <-chan bool) {
    ...
}

The return values should be:

A function with no arguments and no return value which can be run as a goroutine. When run, this goroutine will publish the given report.
A channel to which the goroutine will write true when publication is complete.

Instead of making this a method on HumbugReporter, we could also implement it with signature:

func PublishAsync(reporter *HumbugReporter, report Report) (func(), <-chan bool) {
    ...
}

Humbug Python should rate limit events on the client side

Add a mode which gives callers a job queue to submit reports (rather than request/response).

BUGGER_OFF=true to disable all humbug reporting from any library

For people who want to opt out of any Humbug reporting across any tool with a single setting.

Nil checks on HumbugReporter methods

All of the methods for HumbugReporter work on a *HumbugReporter. They should all do a nil check and, if nil, should print a warning and noop.

Allow integrations to specify blacklist of parameters to send with function usage reports

This will allow Humbug customers to hide sensitive parameter values from usage reports.
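One possible shape for such a blacklist is sketched here. The function name redact_parameters and the placeholder string are assumptions for illustration; nothing like this is confirmed to exist in Humbug.

```python
from typing import Any, Dict, Iterable

# Hypothetical placeholder written in place of a hidden value.
REDACTED = "<redacted>"


def redact_parameters(parameters: Dict[str, Any], blacklist: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of parameters with blacklisted values replaced by a placeholder."""
    hidden = set(blacklist)
    return {
        name: (REDACTED if name in hidden else value)
        for name, value in parameters.items()
    }
```

Keeping the parameter name while hiding the value still shows that the parameter was used, which may be enough for usage analysis.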

Report error names with Javascript error reports

Currently, we have no way of grouping errors together (e.g. by name in Python) in the Javascript library.

Change "bugout_token" member of Reporter to "report_token"

The tokens being used for reporting do not have the full power of Bugout API tokens, and the naming should reflect this.

Relevant line of code:

humbug/python/humbug/report.py

Line 49 in 66de8d4

bugout_token: Optional[str] = None,

Humbug Python should add tags when it is run in a Jupyter notebook

The user request was to identify when Humbug is run in Colab. There are a few environment variables available there that will help us, for example COLAB_GPU.

For other Jupyter environments, this may be a little tougher. It requires some investigation on Kaggle, in local Jupyter notebooks, and in JupyterLab.
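For the Colab case only, a check could be as small as the sketch below. It assumes that the presence of COLAB_GPU is a sufficient signal, which the issue suggests but does not confirm; the function name and the tag value jupyter:colab are hypothetical.

```python
import os
from typing import Mapping, Optional


def detect_colab_tag(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a tag when the environment looks like Colab, else None."""
    env = os.environ if environ is None else environ
    # Assumption: COLAB_GPU is set in Colab runtimes, whatever its value.
    if "COLAB_GPU" in env:
        return "jupyter:colab"
    return None
```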

Give Humbug integrations ability to present users with reports when asking for consent

Consent is checked on every report publication anyway, so it makes sense for consent.ConsentMechanism functions to take an optional Report argument. This can be used to present users with the contents of the report for approval, so they can see exactly what they are consenting to.

Future flows along these lines: users are presented with a raw report and they can modify it prior to giving consent.

Something to think about: the only change required here would be to change the definition of ConsentMechanism from Callable[[], bool] to Callable[[Optional[Report]], bool].

As it stands now, this would introduce a circular dependency between humbug.consent and humbug.report, so we would have to move Report into another file (e.g. data.py).
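The proposed type change could look roughly like this. The Report dataclass here is a simplified stand-in with assumed fields (title, content, tags), and show_report_and_ask is a hypothetical mechanism that takes an injectable ask function so it can be exercised without a terminal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Report:
    """Simplified stand-in for the real Report; the fields are assumptions."""

    title: str
    content: str
    tags: List[str] = field(default_factory=list)


# The proposed definition: a mechanism may receive the report it is asked about.
ConsentMechanism = Callable[[Optional[Report]], bool]


def show_report_and_ask(ask: Callable[[str], str]) -> ConsentMechanism:
    """Build a mechanism that shows the report before asking for consent."""

    def mechanism(report: Optional[Report] = None) -> bool:
        preview = "" if report is None else f"{report.title}\n{report.content}\n"
        answer = ask(f"{preview}Send this report? [y/N] ")
        return answer.strip().lower() in ("y", "yes")

    return mechanism
```

In an interactive tool, the built-in input function could be passed as ask.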

Sanitize filesystem paths in reports

We should clean up paths (for example in stack traces) which look like /home/zomglings/blahblahblah to not include things like username.

One way to make progress in Python is to make all paths relative to current directory or to project root directory.
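A minimal sketch of that idea follows, assuming POSIX-style paths. The function name sanitize_path is hypothetical. Paths under the project root become relative, paths under the home directory have the home prefix replaced with ~, and everything else is left unchanged.

```python
import os


def sanitize_path(path: str, root: str, home: str) -> str:
    """Hide the project root and home directory prefixes of a path."""
    path = os.path.normpath(path)
    # Check the project root first, since it usually lies inside the home directory.
    for base, replacement in ((os.path.normpath(root), "."), (os.path.normpath(home), "~")):
        try:
            inside = os.path.commonpath([path, base]) == base
        except ValueError:
            # Raised when mixing absolute and relative paths or different drives.
            inside = False
        if inside:
            relative = os.path.relpath(path, base)
            return os.path.normpath(os.path.join(replacement, relative))
    return path
```

Applying this to stack traces would still require extracting the file paths from each frame, which is not shown here.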

Method to compose multiple reports into a single report

For example, system report + error report.
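As a rough sketch of what composition might mean, the following merges the content of several reports under one title and combines their tags without duplicates. The Report dataclass and compose_reports are assumptions for illustration, not the library's types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Report:
    """Simplified stand-in for the real Report; the fields are assumptions."""

    title: str
    content: str
    tags: List[str] = field(default_factory=list)


def compose_reports(title: str, reports: List[Report]) -> Report:
    """Combine several reports into one, keeping each part under its own heading."""
    content = "\n\n".join(f"{report.title}\n{report.content}" for report in reports)
    tags: List[str] = []
    for report in reports:
        for tag in report.tags:
            # Preserve first-seen order while dropping duplicate tags.
            if tag not in tags:
                tags.append(tag)
    return Report(title=title, content=content, tags=tags)
```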

Common consent values should be available from consent package

e.g.

consent.yes = ["1", "y", "Y", ...]

and

consent.no = ["0", "n", "N", ...]

Create consent mechanism that disables reporting when Python module is imported as library

Some options. We should implement at least one of the following options, but we do not have to implement them all.

Using inspect

Here is an example of how to detect this automatically:

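import inspect

def is_in_library():
    # Drop the last entry, which is the outermost frame of the call stack.
    stack = inspect.stack()[:-1]
    # If no remaining frame comes from site-packages, we are not in a library.
    if all("site-packages" not in finfo.filename for finfo in stack):
        print("(debug) Not in library!")
        return False
    return True  # don’t send statistics!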

Taken from Ray telemetry proposal: https://docs.google.com/document/d/1gZut2v52xDd3bNBaw2PxilUBWQRixIQcOJPx16FIoAw/edit#

Now, I don't like using inspect.stack (see this Stack Overflow post and this Python mailing list thread for some reasons why).

However, we can implement this and leave it up to the creator of the consent mechanism whether or not they should use it.

Stateful consent switch - off when used as library, on when used as standalone tool

Create a stateful consent mechanism with a boolean switch. By default, the switch will be False and the mechanism will not grant consent. This way, if a tool which integrates with Humbug can be used as a library, any dependents of that library will not send reporting.

However, if the tool can also be used from the command line, the CLI can switch the state of the mechanism to True to enable reporting.
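The switch described above could be sketched as a small callable class. ConsentSwitch is a hypothetical name; the only assumption about Humbug is that a consent mechanism is a callable returning a boolean.

```python
class ConsentSwitch:
    """Consent mechanism that stays off until explicitly switched on."""

    def __init__(self) -> None:
        # Off by default, so importing the tool as a library sends nothing.
        self._enabled = False

    def enable(self) -> None:
        self._enabled = True

    def disable(self) -> None:
        self._enabled = False

    def __call__(self) -> bool:
        # Evaluated on every publication, so a later enable() takes effect.
        return self._enabled
```

A command line entry point would call enable() on the shared instance before doing any work, while library consumers would never touch it.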

Tests should run across multiple Python versions and operating systems

On GitHub Actions, we can use a matrix.

HumbugConsent mechanisms should have a type which is a union of boolean and () => boolean.

The way HumbugConsent is implemented in the JavaScript library doesn't allow for rich consent flows which resolve consent dynamically at runtime. It only allows consent to be resolved at the time that the reporter is created.

Make it look like the Python or Go version of HumbugConsent.

Add timeout to prompt_user consent mechanism

It should be possible to define a user prompt (using humbug.consent.prompt_user) with a default value and a timeout period. If the prompt times out, consent should move forward with the default value.
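One portable way to get this behavior is to read the answer in a background thread and stop waiting after the timeout. This is a sketch, not the humbug.consent.prompt_user implementation: prompt_with_timeout, the accepted answer values, and the injectable read_input parameter are all assumptions.

```python
import threading
from typing import Callable, List

# Assumed sets of accepted answers.
YES = ("y", "yes", "1")
NO = ("n", "no", "0")


def prompt_with_timeout(
    prompt: str,
    default: bool,
    timeout_seconds: float,
    read_input: Callable[[str], str] = input,
) -> bool:
    """Ask for consent, falling back to default when no answer arrives in time."""
    answers: List[str] = []

    def ask() -> None:
        try:
            answers.append(read_input(prompt))
        except EOFError:
            pass  # No terminal attached; treated like a timeout.

    # A daemon thread does not keep the process alive if the user never answers.
    worker = threading.Thread(target=ask, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    if not answers:
        return default
    answer = answers[0].strip().lower()
    if answer in YES:
        return True
    if answer in NO:
        return False
    return default
```

One trade-off of this approach is that a timed-out thread stays blocked on input until the process exits, so it suits a prompt that is shown once rather than repeatedly.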

Add Python tests for HumbugReporter

Our Python 3.6 tests did not catch the issue that dataclasses is not a standard package.

(See this PR: #84)

It means that we don't have any tests for HumbugReporter. We need to add such tests.

Add "error:<error_type>" tag to error reports

Currently, error reports only take the additional tag: type:error.

This doesn't help to understand the distribution of specific errors (for example ValueError in Python).

We should also add the name of the error as a tag. For example, in Python, any ValueError reported to Bugout should also get the tag: error:ValueError.

For errors that are imported from other libraries (e.g. fastapi.HTTPException), we should try to add the fully qualified error path (error:fastapi.HTTPException rather than error:HTTPException).

For errors defined in the package integrating with Humbug, we can use relative paths to start with.

The sample code handles it correctly, so you can just follow that pattern.
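A tag of this form can be derived from the exception's class, as in the sketch below. The function name error_tag is hypothetical. Note that the module recorded on a class is the one where it is defined, which can be longer than the path users import it from.

```python
def error_tag(err: BaseException) -> str:
    """Build an error:<error_type> tag from an exception instance."""
    cls = type(err)
    # Built-in exceptions are tagged by bare name, e.g. error:ValueError.
    if cls.__module__ == "builtins":
        return f"error:{cls.__qualname__}"
    # Everything else gets the defining module as a prefix.
    return f"error:{cls.__module__}.{cls.__qualname__}"
```

Turning the defining module into the shorter public import path, or into a path relative to the integrating package, would need extra logic that is not shown here.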

Ability to set tags on HumbugReporter instance which get published with every report

Don't use blanket "except" in Reporter.publish

https://github.com/bugout-dev/humbug/blob/main/python/humbug/report.py#L128

See comment by u/ElevenPhonons on Reddit: https://www.reddit.com/r/Python/comments/m84h10/humbug_usage_and_crash_reports_for_python/grh7ads?utm_source=share&utm_medium=web2x&context=3

We should at least handle KeyboardInterrupt in a more responsible manner than just swallowing it. This should probably apply to everything that gets caught that isn't a subclass of Exception in the exception hierarchy: https://docs.python.org/3/library/exceptions.html#exception-hierarchy
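The difference can be illustrated with a small wrapper. This is a sketch rather than the code in Reporter.publish: publish_safely and the send callback are hypothetical. Catching Exception instead of using a bare except lets KeyboardInterrupt and SystemExit, which derive directly from BaseException, propagate to the caller.

```python
from typing import Callable


def publish_safely(send: Callable[[], None]) -> bool:
    """Run send, swallowing ordinary errors but not interrupts or exits."""
    try:
        send()
    except Exception:
        # Reporting failures should not crash the host program.
        return False
    return True
```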

Respect DO_NOT_TRACK environment variable

See: https://consoledonottrack.com/

We have already implemented BUGGER_OFF=1. We should just add DO_NOT_TRACK=1 as well with the same functionality.

We have to do this in every client library.
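In Python, a shared check for both variables might look like the following sketch. The function name reporting_allowed is hypothetical, and treating only the values 1 and true as an opt-out is an assumption based on the values mentioned in these issues.

```python
import os
from typing import Mapping, Optional

# Variables that disable reporting when set to an opt-out value.
OPT_OUT_VARIABLES = ("BUGGER_OFF", "DO_NOT_TRACK")
# Assumed opt-out values, compared case-insensitively.
OPT_OUT_VALUES = ("1", "true")


def reporting_allowed(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return False when any opt-out variable is set to an opt-out value."""
    env = os.environ if environ is None else environ
    for variable in OPT_OUT_VARIABLES:
        if env.get(variable, "").strip().lower() in OPT_OUT_VALUES:
            return False
    return True
```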

HumbugReporter.record_call doesn't work well when stacked above @staticmethod

This is a report from a user. I will verify the issue and add the stack trace here.

Report which sends back installed libraries and versions

For Python, basically a report which sends a pip freeze.
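The list could be gathered without shelling out to pip by using importlib.metadata from the standard library (Python 3.8 and later). The function name installed_packages is hypothetical, and the output only approximates pip freeze: it does not reproduce editable installs or URL requirements.

```python
from importlib import metadata
from typing import List


def installed_packages() -> List[str]:
    """Return name==version lines for distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # Skip entries with missing or broken metadata.
        if name:
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)
```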