
cfn-trace's Introduction

High-Level Interests

Web Accessibility

  • Making it as simple as possible for every researcher, designer, developer, product manager, and so on to readily understand the experiences of disabled people, leaving no viable economic argument for not making products accessible.

Observability

  • Making it as easy as possible for engineers to ask arbitrary questions of the humans, code, and infrastructure comprising the sociotechnical systems they are responsible for

Testing

  • Deriving automated test coverage from production telemetry
  • Minimizing regressions while spending the least amount of effort possible doing so
  • Evaluating and understanding the level of safety a test suite provides against regressions
  • Finding ways to practice software testing against simulations or "recordings" of rapidly changing real-world products

cfn-trace's People

Contributors

grunet

Forkers

fulan0dental

cfn-trace's Issues

Add Debug Level Logging (or similar) to the Binary to make Troubleshooting Easier

The areas I was thinking of are:

  • Dumping raw responses from the AWS SDK
  • Turning on the ConsoleSpanExporter (it's commented out right now)
  • Turning on the OTEL SDK's internal diagnostic logging

I'm mostly unsure which CLI flag or parameter would be best for this. Should it be consistent with the AWS CLI? Or with the other cfn binary tools? Should there be a separate log-level input, or a verbosity flag? And what happens if the debug output ends up being more than just logs at some point?

Can Attempt to Include Events From Older Deploys

I haven't reproduced this yet, but I realized that for every stack encountered in the set of nested stacks, the tool currently tries to work with the events from that stack's own latest deploy.

However, those aren't necessarily from the latest deploy of the root stack. For example, if a substack was created once and never touched again, the code as-is would still pull in its events when analyzing the latest deploy of the root stack.

The code should instead ignore any events from any stack that precede the event that marks the start of the most recent deploy of the root stack.
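
A minimal sketch of that filtering, not the tool's actual code. The `StackEvent` shape and the function names are hypothetical, and the sketch assumes that the start of a deploy shows up as a stack-level event (one whose physical resource ID equals the stack ID) with a status of `CREATE_IN_PROGRESS` or `UPDATE_IN_PROGRESS`.

```typescript
// Hypothetical, simplified shape of a CloudFormation stack event.
interface StackEvent {
  stackId: string;
  logicalResourceId: string;
  physicalResourceId: string;
  resourceStatus: string;
  timestamp: Date;
}

// Assumption: these statuses on a stack-level event mark the start of a deploy.
const DEPLOY_START_STATUSES = new Set([
  "CREATE_IN_PROGRESS",
  "UPDATE_IN_PROGRESS",
]);

// Find when the most recent deploy of the root stack started.
function findLatestDeployStart(
  rootStackId: string,
  rootEvents: StackEvent[],
): Date | undefined {
  let latest: Date | undefined = undefined;
  for (const event of rootEvents) {
    // Assumption: an event about the stack itself has the stack ID as its
    // physical resource ID.
    const isStackLevel = event.stackId === rootStackId &&
      event.physicalResourceId === rootStackId;
    if (!isStackLevel || !DEPLOY_START_STATUSES.has(event.resourceStatus)) {
      continue;
    }
    if (latest === undefined || event.timestamp.getTime() > latest.getTime()) {
      latest = event.timestamp;
    }
  }
  return latest;
}

// Drop every event (from any stack) that precedes the start of the most
// recent deploy of the root stack.
function keepEventsFromLatestRootDeploy(
  rootStackId: string,
  rootEvents: StackEvent[],
  allEvents: StackEvent[],
): StackEvent[] {
  const start = findLatestDeployStart(rootStackId, rootEvents);
  if (start === undefined) {
    return [];
  }
  const startMs = start.getTime();
  return allEvents.filter((event) => event.timestamp.getTime() >= startMs);
}
```

With this approach, a substack that was created long ago and never touched again contributes no events, since all of its events precede the root stack's latest deploy.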

Service Name is Missing on All of the Spans

The screenshot from the README illustrates this pretty well.

This doc from otel-js seems to confirm that the "service.name" standard attribute from the opentelemetry spec needs to be set on the spans for this.

Maybe "Cloudformation" is a good default name to hard-code to start with?
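
A minimal, library-free sketch of the defaulting logic, assuming attributes are held as a plain string map before being handed to the OpenTelemetry SDK. The `withServiceName` helper and the `AttributeMap` type are hypothetical. Note that the OpenTelemetry spec defines `service.name` as a resource attribute, so in practice it would likely be set once on the resource rather than on each span.

```typescript
// Hypothetical representation of attributes as a plain string map.
type AttributeMap = Record<string, string>;

// The default suggested above; hard-coded to start with.
const DEFAULT_SERVICE_NAME = "Cloudformation";

// Return a copy of the attributes with "service.name" filled in.
// Precedence: explicit argument, then an existing value, then the default.
function withServiceName(
  attributes: AttributeMap,
  serviceName?: string,
): AttributeMap {
  const resolved = serviceName ??
    attributes["service.name"] ??
    DEFAULT_SERVICE_NAME;
  return { ...attributes, "service.name": resolved };
}
```

Keeping the explicit argument as the highest priority leaves room for a future CLI option to override the hard-coded default.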

Make it Easier to Pass the Binary AWS Credentials and Settings

I kind of want to avoid letting the binary read from users' ~/.aws/ directory automatically, so the other option that came to mind was passing things like the access key, secret, and region as CLI parameters.

I'm not sure how much this would improve ergonomics of using the binary (maybe simplifies certain CI usage?) but wanted to at least capture the idea here.

Update CI to Format Files and Commit Any Changes

Right now deno fmt is being run in CI mostly to check for anything brazenly wrong with the files.

Instead of having folks have to remember to run make format before publishing a PR, it seems like it'd be nice to just automatically do that for them in CI (or a pre-commit-hook if there's a simple, non-obtrusive option for that)

I didn't see any ready made recipe for doing this with Deno (or an out-of-the-box solution) like I believe Prettier's ecosystem has, but I'm sure there's a pattern to follow somewhere that just needs to be found and tried out.

Large Enough Stacks Will Probably Start To Have Span Data Missing

This is coming from the code not currently handling pagination of the AWS JS SDK calls at all (it's just making one call and calling that good).

This AWS doc about when pagination starts suggests it's around when the response reaches 1 MB. When I measured the hollow-nesting-only (root) stack, it came out to about 4000 bytes per deploy. Since 1 MB / 4000 bytes is roughly 250, problems would start to appear when a stack is about 250 times as big (so around 250 resources).
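
A minimal sketch of token-based pagination, assuming each call returns a page of items plus an optional continuation token (the CloudFormation `DescribeStackEvents` API uses `NextToken` for this). The `Page` type, the `fetchPage` callback, and `collectAllPages` are hypothetical names; the callback stands in for whatever wraps the real SDK call.

```typescript
// Hypothetical page shape: a batch of items plus an optional token that
// points at the next page.
interface Page<T> {
  items: T[];
  nextToken?: string;
}

// Hypothetical callback wrapping one SDK call; receives the token from the
// previous page (undefined for the first call).
type FetchPage<T> = (nextToken?: string) => Promise<Page<T>>;

// Keep requesting pages until no continuation token comes back.
// maxPages is a safety valve against a token that never terminates.
async function collectAllPages<T>(
  fetchPage: FetchPage<T>,
  maxPages = 1000,
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined = undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(token);
    all.push(...page.items);
    if (!page.nextToken) {
      return all;
    }
    token = page.nextToken;
  }
  throw new Error(`Gave up after ${maxPages} pages`);
}
```

Injecting the fetch function keeps the loop independent of the SDK, which also makes it easy to exercise with canned pages.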

Handle Stack Creation Scenarios Too

Right now the binary only works on (root) stack updates, because MVP and whatnot.

It should also work on stacks that were just created as well.

Incorporate Stack Outputs into Stack Spans

Cloudformation stacks can output values (link to AWS doc on outputs) that can be used in other stacks, but this info isn't visible in the converted span data anywhere right now.

It might be cool to add this info somewhere on the span corresponding to each stack (and make it more easily searchable probably too).

Add a "--help" Option for the Binary

Right now the help documentation for the CLI lives in the README, but my guess is that's not where anyone would expect to find it when trying to use this tool.

Instead, it would be nice to be consistent with other CLI tools and offer a --help or similar option that lists what the CLI can do, along with other helpful pointers.

Publish an "All in One" Bundle that Includes the OpenTelemetry Collector

Right now, just to get an end-to-end flow working with the binary, you also have to get an OpenTelemetry Collector set up and working, which is annoying at a minimum.

It might simplify setup to create something that bundles both the binary and the OpenTelemetry Collector together (e.g. like a docker image) that you can just download and immediately run and get results.

Incorporate Stack Parameters into Stack Spans

Cloudformation stacks can take in parameters, but this info isn't visible in the converted span data anywhere right now.

It might be cool to add this info somewhere on the span corresponding to each stack (and make it more easily searchable probably too).

Incorporate When Errors Happen in a Cloudformation Stack Deploy

Right now the tool assumes a happy path of a stack update working without issue.

However, in reality a stack might fail to create or update for some reason (and usually roll back).

It would be cool to handle those scenarios, maybe marking relevant spans as error spans and adding attributes for the brief text blurbs Cloudformation gives on its error events.
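
A minimal sketch of how a span's status could be derived from its events, not the tool's actual code. The `StackEvent` and `SpanStatus` shapes are hypothetical, and the sketch assumes that any resource status ending in `_FAILED` (for example `CREATE_FAILED` or `UPDATE_FAILED`) indicates an error, with the event's status reason carrying the text blurb.

```typescript
// Hypothetical, simplified shape of a CloudFormation stack event.
interface StackEvent {
  resourceStatus: string;
  resourceStatusReason?: string;
}

// Hypothetical stand-in for an OpenTelemetry span status
// (the spec defines Unset, Ok, and Error status codes).
interface SpanStatus {
  code: "UNSET" | "ERROR";
  message?: string;
}

// Assumption: failure statuses all end in "_FAILED".
function isFailureStatus(resourceStatus: string): boolean {
  return resourceStatus.endsWith("_FAILED");
}

// Derive a span status from the events that belong to one resource or stack.
// The first failure event found supplies the error message.
function toSpanStatus(events: StackEvent[]): SpanStatus {
  const failure = events.find((e) => isFailureStatus(e.resourceStatus));
  if (failure === undefined) {
    return { code: "UNSET" };
  }
  return { code: "ERROR", message: failure.resourceStatusReason };
}
```

The same message could also be attached as a span attribute so that it is searchable in the tracing backend.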
