
xraysink's Introduction

xraysink

Extra AWS X-Ray instrumentation to use distributed tracing with asyncio Python libraries that are not (yet) supported by the official aws_xray_sdk library.

What Problem Does xraysink Solve?

aws_xray_sdk is the standard library to collect trace data from your Python code and send the trace data to the AWS X-Ray distributed tracing tool. However, if you have asyncio Python code, then there are some gaps and occasional bugs in the functionality provided by that library. xraysink plugs those gaps.

It can be a bit confusing using two libraries together, so here's a high-level breakdown of which library will help you do what:

  • Add tracing to HTTP requests handled by FastAPI (or another async Python web framework): xraysink (via middleware)
  • Add tracing to background (non-HTTP-request) functions written as async Python functions: xraysink (via xray_task_async decorator)
  • Everything else: aws_xray_sdk

Integrations Supported

  • Generic ASGI-compatible tracing middleware for any ASGI-compliant web framework (this has been tested with FastAPI; see below)
  • asyncio Tasks
  • Background jobs/tasks

Installation

xraysink is distributed as a standard Python package through PyPI, so you can install it with your favourite Python package manager. For example:

pip install xraysink

How to use

xraysink augments the functionality provided by aws_xray_sdk. Before using the tools in xraysink, you first need to configure aws_xray_sdk - this will probably involve calling xray_recorder.configure() when your process starts, and optionally aws_xray_sdk.core.patch().
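
For example, a minimal startup sketch might look like the following (the service name and the patched modules here are illustrative placeholders):

from aws_xray_sdk.core import patch, xray_recorder

# Configure the X-Ray recorder once, when your process starts
xray_recorder.configure(service="my-service")

# Optionally patch supported libraries so their outgoing calls are traced
patch(("botocore", "requests"))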

Extra instrumentation provided by xraysink is described below.

FastAPI

Instrument incoming requests in your FastAPI web server by adding the xray_middleware to your app. For example, you could do:

from aws_xray_sdk.core import xray_recorder
from fastapi import FastAPI
from starlette.middleware.base import BaseHTTPMiddleware
from xraysink.asgi.middleware import xray_middleware
from xraysink.context import AsyncContext  # The fixed AsyncContext (see below)

# Standard asyncio X-Ray configuration, customise as you choose
xray_recorder.configure(context=AsyncContext(), service="my-cute-little-service")

# Create a FastAPI app with various middleware. Any middleware that is added
# earlier will have the X-Ray tracing context available to it.
app = FastAPI()
app.add_middleware(MyTracingDependentMiddleware)  # A placeholder for your own middleware
app.add_middleware(BaseHTTPMiddleware, dispatch=xray_middleware)

Asyncio Tasks

If you start asyncio Tasks from a standard request handler, then the AWS X-Ray SDK will not correctly instrument any outgoing requests made inside those Tasks.

Use the fixed AsyncContext from xraysink as a drop-in replacement, like so:

from aws_xray_sdk.core import xray_recorder
from xraysink.context import AsyncContext  # NB: Use the AsyncContext from xraysink

# Use the fixed AsyncContext when configuring X-Ray,
# and customise other configuration as you choose.
xray_recorder.configure(context=AsyncContext(use_task_factory=True))
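
With use_task_factory=True, Tasks created from an instrumented coroutine inherit the trace context, so their outgoing requests should be recorded correctly. A rough sketch (the handler and helper function here are hypothetical):

import asyncio

async def send_welcome_email(user_id):
    # Calls made here through patched libraries are traced as subsegments
    ...

async def handle_signup(request):
    # The Task is started from a traced request handler, so it keeps the context
    asyncio.create_task(send_welcome_email(request.path_params["user_id"]))
    return {"status": "ok"}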

Background Jobs/Tasks

If your process starts background tasks that make network calls (eg. to the database or an API in another service), then each execution of one of those tasks should be treated as a new X-Ray trace. Indeed, if you don't do so then you will likely get context_missing errors.

An async function that implements a background task can be easily instrumented using the @xray_task_async() decorator, like so:

from aws_xray_sdk.core import xray_recorder
from xraysink.context import AsyncContext
from xraysink.tasks import xray_task_async

# Standard asyncio X-Ray configuration, customise as you choose
xray_recorder.configure(context=AsyncContext(), service="my-cute-little-service")

# Any call to this function will start a new X-Ray trace
@xray_task_async()
async def cleanup_stale_tokens():
    await database.get_table("tokens").delete(age__gt=1)

# Start your background task using your scheduling system of choice :)
schedule_recurring_task(cleanup_stale_tokens)

If your background task functions are called from a function that is already instrumented (eg. send an email immediately after handling a request), then the background task will appear as a child segment of that trace. In this case, you must ensure you use the non-buggy AsyncContext when configuring the recorder (ie. from xraysink.context import AsyncContext).
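
For example, a hypothetical sketch of that pattern (the handler and helper names are made up for illustration):

import asyncio

from xraysink.tasks import xray_task_async

@xray_task_async()
async def send_confirmation_email(order_id):
    ...

async def handle_order(request):  # Already traced by the ASGI middleware
    order = await create_order(request)
    # Runs in the background, appearing as a child segment of the request's trace
    asyncio.create_task(send_confirmation_email(order.id))
    return order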

CloudWatch Logs integration

You can link your X-Ray traces to your CloudWatch Logs log records, which enhances the integration with AWS CloudWatch ServiceLens. Take the following steps:

  1. Put the X-Ray trace ID into every log message. There is no fixed requirement for how to do this (it just has to appear verbatim in the log message somewhere), but if you are using structured logging then the convention is to use a field called traceId (see also the logging-filter sketch at the end of this section). Here's an example:

    trace_id = xray_recorder.get_trace_entity().trace_id
    logging.getLogger("example").info("Hello World!", extra={"traceId": trace_id})
    
  2. Explicitly set the name of the CloudWatch Logs log group associated with your process. There is no general way to detect the Log Group from inside the process, hence it requires manual configuration as part of your process initialisation (eg. in the same place where you call xray_recorder.configure).

    set_xray_log_group("/example/service-name")
    

Note that this feature relies on undocumented functionality, and is not yet supported by the official Python SDK.
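
As an illustration of step 1, a hypothetical logging filter could attach the trace ID to every record automatically (the traceId field name follows the convention above):

import logging

from aws_xray_sdk.core import xray_recorder

class TraceIdFilter(logging.Filter):
    """Attach the current X-Ray trace ID to every log record."""

    def filter(self, record):
        # Same lookup as the example above; None if there is no current trace entity
        entity = xray_recorder.get_trace_entity()
        record.traceId = getattr(entity, "trace_id", None)
        return True

Add the filter to your log handler(s), and include the traceId field in your structured log output.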

Licence

This project uses the Apache 2.0 licence, to make it compatible with aws_xray_sdk, the primary library for integrating with AWS X-Ray.

xraysink's People

Contributors

fullerzz, garyd203, github-actions[bot], mroswald, tim-connolly


xraysink's Issues

Upgrade to support latest aws-xray-sdk

  • Update standard version we test against to v2.12
  • See if there's anything new that means we can modify or deprecate our functionality
  • Check if aws/aws-xray-sdk-python#340 solves the underlying problem, and create an issue for us to lean into that if it does. Otherwise, maybe push our better fix upstream if that's easy
  • Write a test to verify the behaviour of the core library in the scenario where an X-Ray trace does not exist -> See #105

Fix instrumentation on independent (background) asyncio tasks

The standard AWS SDK appears to propagate the same X-Ray segment into the asyncio context for asyncio tasks that are started from a traced asyncio context (see aws_xray_sdk.async_context.task_factory). This means that when such a task makes remote calls (ie. creates a subsegment) after the initial segment has been completed, an error occurs (because a subsegment has to be wholly within its parent segment). Instead, let's look at starting a trivial subsegment in the initial segment when the background task is started, and then having the background task start a new segment with that subsegment as its parent.

Ideally we want the resulting trace to look like the SQS instrumentation.

Use case is a request handler that starts a background task for some long-running processing.

Maybe don't start a new segment (or subsegment, for that matter) automatically. Rather, do it only for asyncio tasks/coroutines that are explicitly marked - eg. with @xray_task_async. This will mean we want a few more fields on that decorator (eg. a segment name might be useful now).

General use questions

I'm trying to get this going, but I feel like the documentation is confusing.

xray_recorder.configure(context=AsyncContext(), service="my-cute-little-service")

# Create a FastAPI app with various middleware
app = FastAPI()
app.add_middleware(MyTracingDependentMiddleware)  # Any middleware that is added earlier will have the X-Ray tracing context available to it
app.add_middleware(BaseHTTPMiddleware, dispatch=xray_middleware)
  • which BaseHTTPMiddleware is this referring to?
  • what is xray_middleware?
  • do we import xraysink anywhere?
  • I'm using decorators on my endpoints, eg. @xray_recorder.capture_async("status") - should I continue using these?

do something with coverage

Support async+thread context propagation in FastAPI

When FastAPI runs a non-async dependency or handler, it puts it into an internal thread pool so that the async execution doesn't get blocked. I'm pretty sure that the trace context doesn't get propagated into that thread pool (see Sean's problem in #88), which is obviously undesirable. It'd be nice if this worked seamlessly instead.

We probably need a new context (or just extend the existing no-bugs AsyncContext in xraysink) which can hook into FastAPI's specific non-async dispatch in order to add the trace context to those calls. This work could be fairly large!

Possibly related, but probably a different piece of work: support calling an instrumented sync function from async code (eg. see aws/aws-xray-sdk-python#164 (comment)).

encode/starlette#1258 may be a helpful starting point

Question: AWS background task

My question is: how can I make X-Ray trace my calls to the database from a background task? Currently you can see that the task is running, but it does not give me information about the host, and the task is also repeated - what is this due to?

AWS X-Ray configuration:
(screenshot)

This is the AWS trace:
(screenshot)

This is my async task:
(screenshot)

Add more style checking

  • find a good tool
  • configure for python and run in CI
  • configure for yaml and run in CI
  • enforce a good import order

auto-build `master` commits

Whenever a new commit is pushed to master (TBD: Or maybe some other trigger like setting a tag???), automatically do a minor version release, with associated version bumping and whatnot.

desired actions:

  • validate we are on master
  • run tests (again)
  • bump package version (in multiple places) from 1.2.3-rc to 1.2.3
  • check git tag matches in-code version
  • do poetry build, smoke test on artifact, then poetry publish
  • add git tag (and implied github release)
  • bump package version to 1.2.4-rc for continued development

Ideas for a trigger:

  1. X create an issue (by me) with a specifically formatted subject, or a special label
  2. X use a deployment?? Not sure that makes conceptual sense
  3. do a release. That's back to front though, I feel.
  4. X also rely on the workflow_run:completed event for a dependent workflow
  5. create a tag. This is identical to creating a release, really.
  6. -->> manually run the workflow https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow

one method that might work:

  • use a script to bump version strings and add tag, and push upstream
  • have a workflow that listens for the tag and does the rest

Problem getting host when running under http/2

Using HTTP2 to an AWS ELBv2 in front of an ECS Service:

File "/opt/venv/lib/python3.9/site-packages/starlette/middleware/base.py", line 30, in coro
await self.app(scope, request.receive, send_stream.send)
File "/opt/venv/lib/python3.9/site-packages/starlette/middleware/base.py", line 55, in __call__
response = await self.dispatch_func(request, call_next)
File "/opt/venv/lib/python3.9/site-packages/xraysink/asgi/middleware.py", line 46, in xray_middleware
request.headers["host"].split(":", 1)[0], xray_recorder
File "/opt/venv/lib/python3.9/site-packages/starlette/datastructures.py", line 542, in __getitem__
raise KeyError(key)
KeyError: 'host'

Other dependencies

  • FastAPI v0.70.0
  • uvicorn v0.17.4
  • no gunicorn

Research:

Set `user` on the X-Ray segment from ASGI scope or framework's request

If the X-Ray middleware is called after any authentication/session middleware, then the ASGI scope will probably contain some sort of reference to the authenticated user, which we can use to automatically set the user field on the Segment.

Here are some examples of how different web frameworks set the user (a rough sketch of how this could be used follows the list):

  • aiohttp has no built-in user concept.
  • aiohttp-jwt will store the decoded JWT in a configurable key on the request (default payload)
  • FastAPI does what Starlette does
  • Piccolo extends Starlette's AuthenticationMiddleware (see below), and also adds a secret_token_user key to the scope.
  • Starlette's builtin AuthenticationMiddleware adds a user key to the scope and a user attribute on the request, so either scope["user"].display_name or request.user.display_name is correct (but also check user.is_authenticated)
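
A rough sketch of what the middleware could do when Starlette's AuthenticationMiddleware runs first (this is speculative; segment.set_user() is the standard aws_xray_sdk call for setting the user field):

from aws_xray_sdk.core import xray_recorder

def set_segment_user_from_scope(scope):
    """Copy the authenticated user's name onto the current X-Ray segment, if any."""
    user = scope.get("user")
    if user is not None and getattr(user, "is_authenticated", False):
        segment = xray_recorder.current_segment()
        if segment is not None:
            segment.set_user(user.display_name)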

tracing exclusion rules for FastAPI endpoints

Very cool - thanks for the FastAPI middleware. Since the middleware means we don't have to decorate each function, I have the opposite problem - a situation where I'd like to exclude tracing for a particular endpoint. Is there a pattern the AWS X-Ray SDK uses in cases like this? A function decorator? A path config used by the middleware?

FastAPI implementation throwing missing segment

Hi, I am trying to use this library to add X-Ray tracing to my FastAPI API.

I have the following code:

    xray_recorder.configure(context_missing="LOG_ERROR")
    xray_recorder.configure(service="API DEV")
    xray_recorder.configure(plugins=("ECSPlugin",))
    xray_patch_all()
    app.add_middleware(BaseHTTPMiddleware, dispatch=xray_middleware)

But for each request I send, I see that the daemon only receives one segment, usually with a response time of 0.04ms, which definitely isn't the real response time. I see that this segment is sent before the response is even returned to the client.

Along with this, the following is logged in my application:
cannot find the current segment/subsegment, please make sure you have a segment open

Is there a working example of this working with FastAPI? Am I missing something here? Thanks!

xraysink on FastAPI with boto3

Question/query - has anyone successfully used boto3 with xraysink, particularly Secrets Manager?

    xray_recorder.configure( context=AsyncContext() )
    libraries = (['botocore','boto3','psycopg2'])
    patch(libraries)
    app.add_middleware(BaseHTTPMiddleware, dispatch=xray_middleware) 

Some of the symptoms I'm seeing suggest that mixing the two may be an issue (ie. the async context in the X-Ray recorder combined with boto3's sync methods)?

My app logs show things like this:

Could not get Secret Manager credentials, exception: [ cannot find the current segment/subsegment, please make sure you have a segment open ]

Add changelog to PyPI

I'm not entirely sure now that I want to do this for PyPI...

However, I think we can do this by concatenating (part of) it to the README in the pyproject.toml.

General Usage Question #2

Following on
#22

I noticed the use of AsyncContext for FastAPI - is this from xraysink or the AWS SDK? Is it needed?

I've attempted an implementation with each and have found that my async request handlers are unable to retrieve the context. This leads to AttributeErrors, or just general errors where the segment cannot be found.

Set up CI with GitHub Actions

  • PR test & style check
  • master-branch auto build and version bump -> Split out to #51
  • Matrix testing with versions of aws-xray-sdk and the 3rd party libraries we integrate with (eg. aiohttp and fastapi)

Set log group on the x-ray segment

Be able to explicitly set the AWS CloudWatch log group used for this segment. This appears to be set using the field segment['aws']['cloudwatch_logs']['logGroup'] (according to the Java SDK). NB: it is more likely to be log_group, since everything else is snake-cased.

None of the Python plugins currently set this (as of SDK v2.6.0), and it is unknowable for execution environments like ECS anyway, so let's provide the convenience. aws/aws-xray-sdk-python#188 tracks the official SDK's implementation of this feature.

Fix instrumentation on dependent (gathered) asyncio tasks

Subsegment parents should work correctly. The use case is a request handler that spawns multiple asyncio tasks and then gathers them all together to compose the response (eg. tartiflette).

My working theory is that aws-xray-sdk has a bug where it copies a reference to the context dictionary onto a new task, rather than deep- or shallow-copying the values. This means that subsegments push/pop on the same list, so the push/pop operations could be interleaved in an incorrect order.

This bugfix should definitely be moved into the upstream aws-xray-sdk-python repo as a PR.
