datadog / ecommerce-workshop
Example eCommerce App for workshops and observability
License: Other
Another optimization we can make to this project is to correct the timeline of the git history. Originally this repo was created as a copy of a clone of https://github.com/spree/spree, which brought in 20,000+ commits of history from that repo that are in no way related to the work of our project. The history in this repo starts at 4f09169 and I propose we remove all the history up to and including f8e7b4e.
This will benefit the project by drastically cutting down the overall history you need to synchronize any time you clone the project to contribute code. It will also shrink all the Katacoda scenario images, since there are fewer changes to synchronize.
For better inclusivity in the modern tech world, we should be ditching old terminology like master/slave for main/leader/follower everywhere we can. This is already the new default on GitHub, and migrating will have bigger implications for downstream consumers of this repo since it wasn't set up with main by default.
For this Learning Center RUM scenario, we hacked Storedog to fetch ads and discounts async via XHR. This required adding CORS to those services. This would be useful as a permanent feature.
See assets/discounts.py and assets/ads.py for the minor changes necessary.
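As a rough illustration of what "adding CORS" to a Python service can look like, here is a minimal, standard-library-only sketch of a WSGI middleware that appends CORS headers to every response. It is not the change made in assets/discounts.py or assets/ads.py; the allowed origin, the stand-in `demo_app`, and the `call_app` helper are all assumptions made for the example. In a Flask app, an extension such as flask-cors would be a more typical route.

```python
# Hypothetical origin; the real Storedog frontend URL depends on the deployment.
ALLOWED_ORIGIN = "http://localhost:3000"


class CorsMiddleware:
    """Wrap any WSGI app and add CORS headers to each response."""

    def __init__(self, app, allowed_origin=ALLOWED_ORIGIN):
        self.app = app
        self.allowed_origin = allowed_origin

    def __call__(self, environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Append the CORS headers to whatever the wrapped app returned.
            headers = list(headers) + [
                ("Access-Control-Allow-Origin", self.allowed_origin),
                ("Access-Control-Allow-Methods", "GET, OPTIONS"),
            ]
            return start_response(status, headers, exc_info)

        return self.app(environ, cors_start_response)


def demo_app(environ, start_response):
    """Stand-in for a service endpoint: returns a fixed JSON body."""
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b'[{"code": "EXAMPLE", "value": 10}]']


def call_app(app):
    """Invoke a WSGI app once and return (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(CorsMiddleware(demo_app)))
```

Wrapping at the WSGI layer keeps the service code untouched, which may suit a feature that should be easy to switch on per scenario.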
Replace the sleep errors in the app with N+1 query per this suggestion:
Maybe we can make it an N+1 query? Because there would be a hint on the flame graph as well so that could be worked into the steps to look there, notice it's repeatedly hitting the db, and then go look? More realistic than a sleep in production, and also showcases how datadog APM gets you to a root cause
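To make the suggestion concrete, here is a small sketch of the N+1 pattern using an in-memory SQLite database. The table and column names are hypothetical and do not reflect the actual Storedog or discounts schema; the point is only the shape of the bug: one query for the list, then one more query per row.

```python
import sqlite3


def build_db():
    """Create a throwaway database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE discount_type (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE discount (id INTEGER PRIMARY KEY, code TEXT, type_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO discount_type VALUES (?, ?)", [(1, "percent"), (2, "flat")]
    )
    conn.executemany(
        "INSERT INTO discount VALUES (?, ?, ?)",
        [(i, f"CODE{i}", 1 + i % 2) for i in range(1, 6)],
    )
    return conn


def discounts_n_plus_one(conn):
    """The N+1 shape: 1 query for the list, then 1 query per row."""
    queries = 1
    rows = conn.execute("SELECT id, code, type_id FROM discount ORDER BY id").fetchall()
    result = []
    for _id, code, type_id in rows:
        name = conn.execute(
            "SELECT name FROM discount_type WHERE id = ?", (type_id,)
        ).fetchone()[0]
        queries += 1
        result.append((code, name))
    return result, queries


def discounts_joined(conn):
    """The fix: a single JOIN returns the same data in one query."""
    rows = conn.execute(
        "SELECT d.code, t.name FROM discount d "
        "JOIN discount_type t ON t.id = d.type_id ORDER BY d.id"
    ).fetchall()
    return rows, 1


if __name__ == "__main__":
    conn = build_db()
    print(discounts_n_plus_one(conn))
    print(discounts_joined(conn))
```

With five rows the slow version issues six queries, and in a trace each of those would presumably show up as its own database span, which is the repeated-hit hint the suggestion describes.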
To help make things quicker to spin up across platforms, we should cross-publish all containers to Google Container Registry (GCR) under the datadog-community project. This might be an easy lift in updating our workflows to publish to both, but I don't know the complexity of it yet.
When attempting to build store-frontend-broken-no-instrumentation, it looks like there's an error with a pulled dependency:
docker build .
[+] Building 13.6s (11/13)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 694B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/ruby:2.5.1 0.7s
=> [1/9] FROM docker.io/library/ruby:2.5.1@sha256:ac6661b87cf49af14b1930 0.0s
=> [internal] load build context 0.9s
=> => transferring context: 28.66MB 0.9s
=> CACHED [2/9] RUN apt-get update -qq && apt-get install -y build-ess 0.0s
=> CACHED [3/9] COPY . /spree 0.0s
=> CACHED [4/9] WORKDIR /spree 0.0s
=> CACHED [5/9] RUN bundle install 0.0s
=> CACHED [6/9] RUN bundle update sassc && bundle exec rake store-fronte 0.0s
=> ERROR [7/9] RUN cd store-frontend && bundle update sassc 11.8s
------
> [7/9] RUN cd store-frontend && bundle update sassc:
#9 0.530 fatal: Not a git repository (or any of the parent directories): .git
#9 0.537 fatal: Not a git repository (or any of the parent directories): .git
#9 0.543 fatal: Not a git repository (or any of the parent directories): .git
#9 0.546 fatal: Not a git repository (or any of the parent directories): .git
#9 0.550 fatal: Not a git repository (or any of the parent directories): .git
#9 0.554 fatal: Not a git repository (or any of the parent directories): .git
#9 0.559 fatal: Not a git repository (or any of the parent directories): .git
#9 0.749 The dependency tzinfo-data (>= 0) will be unused by any of the platforms Bundler is installing for. Bundler is installing for ruby but the dependency is only for x86-mingw32, x86-mswin32, x64-mingw32, java. To add those platforms to the bundle, run `bundle lock --add-platform x86-mingw32 x86-mswin32 x64-mingw32 java`.
#9 0.751 Fetching https://github.com/spree/spree_gateway.git
#9 1.666 Fetching https://github.com/spree/spree_auth_devise.git
#9 3.011 Fetching gem metadata from https://rubygems.org/...........
#9 6.438 Fetching gem metadata from https://rubygems.org/.............
#9 8.264 Fetching gem metadata from https://rubygems.org/.............
#9 10.85 Resolving dependencies....
#9 11.35 Your bundle is locked to mimemagic (0.3.3), but that version could not be found
#9 11.35 in any of the sources listed in your Gemfile. If you haven't changed sources,
#9 11.35 that means the author of mimemagic (0.3.3) has removed it. You'll need to update
#9 11.35 your bundle to a version other than mimemagic (0.3.3) that hasn't been removed
#9 11.35 in order to install.
------
executor failed running [/bin/sh -c cd store-frontend && bundle update sassc]: exit code: 7
make: *** [build] Error 1
Currently, applying a discount code to a shopping cart does not work; the UI displays a "code not found" error. It would be awesome if storedog actually used the discounts service to reduce checkout totals.
Use case: a synthetic browser test demo that would scrape the code and value from the home page and apply the code to the cart. It would then do the math to confirm that the new cart total is correct.
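The "do the math" step of that use case could look roughly like the sketch below. Whether a discount value is a flat amount or a percentage is not stated here, so the function supports both and treats the choice as an assumption; the function name and the sample amounts are also made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def expected_total(cart_total, discount_value, percent=False):
    """Return the cart total a test would expect after applying a discount.

    Amounts are passed as strings and handled as Decimal to avoid float
    rounding surprises. The total never drops below zero.
    """
    total = Decimal(cart_total)
    value = Decimal(discount_value)
    # Assumption: a discount is either a flat amount or a percentage.
    reduction = total * value / Decimal(100) if percent else value
    new_total = max(total - reduction, Decimal("0"))
    return new_total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(expected_total("59.98", "10"))                # flat discount
    print(expected_total("59.98", "10", percent=True))  # percentage discount
```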
In order for us to reliably verify changes to dependencies or code, we should get into the habit of testing the discounts service in CI.
Script a headless browser to interact with the Storedog UI. Perhaps part of the traffic container? Selenium, PhantomJS, etc.
For an upcoming course about custom span tags, custom method instrumentation, and generating metrics from spans, we need access to some interesting functionality to tell the story.
Currently the narrative we are looking to hit is that of a problem with the shopping cart. The shopping cart should sometimes fail, sometimes be successful through checkout, and also allow for custom tagging to expose cart totals.
The current cart implementation doesn't lend itself to demonstrating this functionality, nor to tagging the associated spans or methods.
The RUM configuration for Storedog is hardcoded to production. Proposal for template (there are a few):
sed -i 's/production/<%= ENV["DD_ENV"] %>/g' ./store-frontend-instrumented-fixed/store-frontend/app/views/spree/layouts/spree_application.html.erb
Clicking on the Add to Cart button for a product in the embedded synthetic browser test recorder results in "Your cart is empty."
I see no relevant errors in the browser console, logs, or traces. I suspect it might be a cookie problem related to being in an iframe. I dug around in the code to find where I could set SameSite=None on the token and guest_token cookies, but failed.
Current logging configuration requires users to create and update a clone of the Ruby integration pipeline. Need a logging configuration that will work with the OOTB Ruby integration pipeline. (Discussed with Jeremy)
Using the ECS example (https://github.com/DataDog/ecommerce-workshop/blob/main/deploy/aws/ecs/shop-task.json#L63) the container exits with:
/usr/local/lib/ruby/gems/2.5.0/gems/bundler-1.16.6/lib/bundler/rubygems_integration.rb:408:in `block (2 levels) in replace_gem': Error loading the 'postgresql' Active Record adapter. Missing a gem it depends on? pg is not part of the bundle. Add it to your Gemfile. (LoadError)
Hello team,
I have deployed this Storedog app on Kubernetes:
https://github.com/DataDog/ecommerce-workshop/tree/main/deploy/generic-k8s/ecommerce-app
When I open the app in a browser, I see that it is broken.
Can you share working application YAML files?
Per this comment we should look at slimming down the advertisements and discounts services to remove any extra dependencies like the RandomWord one.
Since github can now host containers, it would be a great idea to automate the docker container builds and publish them up here as artifacts.
We are going to start versioning our containers and repository so the training material that depends on this project can stay stable and predictable. The following checklist will become part of the RELEASING.md doc which will eventually be fully automated. For now it'll be a very manual procedure because what is life without pain before automation to take that away? 🤔
Checklist:
ddtraining (Look in the menu for Docker Desktop, you should see a ddtraining item just above the "Quit Docker Desktop" option). There is no reliable way to check this programmatically right now 😢)
CHANGELOG.md entry by changing [unreleased] to [1.0.0] YYYY-MM-DD
git add CHANGELOG.md && git commit --message="Release version 1.0.0"
git tag --annotate 1.0.0 --message="Release 1.0.0"
git push && git push --tags
CHANGELOG.md notes for the 1.0.0 release in the body and put "1.0.0" for the title.
docker pull ddtraining/advertisements:latest
docker tag ddtraining/advertisements:latest ddtraining/advertisements:1.0.0
docker push ddtraining/advertisements:1.0.0
docker pull ddtraining/advertisements-fixed:latest
docker tag ddtraining/advertisements-fixed:latest ddtraining/advertisements-fixed:1.0.0
docker push ddtraining/advertisements-fixed:1.0.0
docker pull ddtraining/discounts:latest
docker tag ddtraining/discounts:latest ddtraining/discounts:1.0.0
docker push ddtraining/discounts:1.0.0
docker pull ddtraining/discounts-fixed:latest
docker tag ddtraining/discounts-fixed:latest ddtraining/discounts-fixed:1.0.0
docker push ddtraining/discounts-fixed:1.0.0
docker pull ddtraining/storefront:latest
docker tag ddtraining/storefront:latest ddtraining/storefront:1.0.0
docker push ddtraining/storefront:1.0.0
docker pull ddtraining/storefront-fixed:latest
docker tag ddtraining/storefront-fixed:latest ddtraining/storefront-fixed:1.0.0
docker push ddtraining/storefront-fixed:1.0.0
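Until the procedure is automated, the repetitive pull/tag/push steps could be generated rather than typed. The sketch below only prints the commands for the six images listed above and does not run Docker; the function name and its parameters are assumptions for illustration.

```python
# Image names taken from the checklist above.
IMAGES = [
    "advertisements",
    "advertisements-fixed",
    "discounts",
    "discounts-fixed",
    "storefront",
    "storefront-fixed",
]


def release_commands(version, org="ddtraining", images=IMAGES):
    """Return the docker commands that retag each image's latest as `version`."""
    commands = []
    for image in images:
        source = f"{org}/{image}:latest"
        target = f"{org}/{image}:{version}"
        commands += [
            f"docker pull {source}",
            f"docker tag {source} {target}",
            f"docker push {target}",
        ]
    return commands


if __name__ == "__main__":
    # Print the commands for review; pipe to a shell only after checking them.
    print("\n".join(release_commands("1.0.0")))
```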
As a course developer in the Datadog training platform I can access a Kali Linux or similar container with attack tools. The container should spin up behind a feature flag along with the rest of the environment and provide the ability to run specific automation.
The container bootstrap should be such that we can extend the number of attacks and containers impacted.
A new bug introduced with #80 is making the dd-trace-rb library not emit any tracing data. During startup there is this log message which might be a regression in the auto-instrumentation library.
W, [2021-03-30T21:25:44.904436 #8] WARN -- ddtrace: [ddtrace] Unabe to enable Datadog Trace context, Logger #<SemanticLogger::Logger:0x0000556c61b81ac8> is not supported
I will file a bug with the tracer and work with them to get this resolved. For now, it's blocking the 1.0 release.
To start unifying the deployment story we should convert the various deployment methods into terraform modules so we can target different platforms and have a more out of the box experience. This also includes ample documentation to make it easier for anyone to deploy this project where they need to.
Any idea what the issue with the container could be? I am using the latest version of the repo.
`
[root@ip-172-31-12-118 traffic-replay]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
54e11f2afb60 traffic-replay "/bin/sh -c 'wait-fo…" 8 minutes ago Up 8 minutes distracted_liskov
7912f14cb332 ddtraining/storefront:2.1.0 "sh docker-entrypoin…" 12 minutes ago Up 12 minutes 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp docker-compose-frontend-1
013da90c3c49 ddtraining/discounts:2.1.0 "/usr/bin/dumb-init …" 12 minutes ago Exited (2) 12 minutes ago docker-compose-discounts-1
6dd6bc41963a ddtraining/advertisements:2.1.0 "ddtrace-run flask r…" 12 minutes ago Up 12 minutes 0.0.0.0:5002->5002/tcp, :::5002->5002/tcp docker-compose-advertisements-1
8698bafba68a datadog/agent:7.29.0 "/init" 12 minutes ago Up 12 minutes (unhealthy) 8125/udp, 0.0.0.0:8126->8126/tcp, :::8126->8126/tcp docker-compose-agent-1
c8718bf8adfa postgres:11-alpine "docker-entrypoint.s…" 12 minutes ago Up 12 minutes 5432/tcp docker-compose-db-1
[root@ip-172-31-12-118 traffic-replay]# docker logs 013da90c3c49
Starting OpenBSD Secure Shell server: sshd.
rsyslogd: imklog: cannot open kernel log (/proc/kmsg): Operation not permitted.
rsyslogd: activation of module imklog failed [v8.1901.0 try https://www.rsyslog.com/e/2145 ]
Starting enhanced syslogd: rsyslogd.
Usage: flask run [OPTIONS]
Try 'flask run --help' for help.
Error: Invalid value for '--port' / '-p': is not a valid integer
2022-06-17 03:07:25,128 ERROR [ddtrace.profiling.scheduler] [scheduler.py:52] [dd.service=discounts-service dd.env=development dd.version=1.0 dd.trace_id=0 dd.span_id=0] - Unable to export profile: ddtrace.profiling.exporter.ExportError: HTTP Error 403. Ignoring.
Failed to start my_second_process: 2
[root@ip-172-31-12-118 traffic-replay]#
`
Create a bug in the coupon code logic, and showcase it using APM
The application should be deployable with logs in an unstructured state. The user should be able to modify the logging library or formats in order to provide a normalized log structure.
Full disclosure, I haven't used this project in quite some time and my fork of the code became outdated with deprecated Ruby dependency packages. While trying to get a more recent snapshot of the code, I deployed the project using the latest code from the main branch. The default page loads eventually after a couple of refreshes and the DB is populated with records and schema. However, the problem is that I get this error message when clicking the Cart button or trying to navigate to the /admin URL.
I have tried using the source code from all three ads-service folders (broken, fixed, errors) with the same result. I also tried alternate code for the other services (discount and store-frontend) with no success.
Something to note is that I'm not running Datadog on my cluster presently, but I would imagine this would be fine. I do see errors reported in the logs.
[advertisements] 2021-08-06 06:55:40,688 ERROR [ddtrace.internal.writer] [writer.py:202] [dd.service=advertisements-service dd.env=development dd.version=1.0.1 dd.trace_id=0 dd.span_id=0] - failed to send traces to Datadog Agent at http://172.18.0.5:8126
[advertisements] Traceback (most recent call last):
[advertisements] File "/usr/local/lib/python3.9/site-packages/ddtrace/internal/writer.py", line 200, in _send_payload
[advertisements] response = self._put(payload, headers)
[advertisements] File "/usr/local/lib/python3.9/site-packages/ddtrace/internal/writer.py", line 165, in _put
[advertisements] conn.request("PUT", self._endpoint, data, headers)
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 1257, in request
[advertisements] self._send_request(method, url, body, headers, encode_chunked)
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 1303, in _send_request
[advertisements] self.endheaders(body, encode_chunked=encode_chunked)
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 1252, in endheaders
[advertisements] self._send_output(message_body, encode_chunked=encode_chunked)
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 1012, in _send_output
[advertisements] self.send(msg)
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 952, in send
[advertisements] self.connect()
[advertisements] File "/usr/local/lib/python3.9/http/client.py", line 923, in connect
[advertisements] self.sock = self._create_connection(
[advertisements] File "/usr/local/lib/python3.9/socket.py", line 843, in create_connection
[advertisements] raise err
[advertisements] File "/usr/local/lib/python3.9/socket.py", line 831, in create_connection
[advertisements] sock.connect(sa)
[advertisements] ConnectionRefusedError: [Errno 111] Connection refused
[advertisements] 2021-08-06 06:55:43,022 INFO [werkzeug] [_internal.py:113] [dd.service=advertisements-service dd.env=development dd.version=1.0.1 dd.trace_id=0 dd.span_id=0] - 10.244.1.5 - - [06/Aug/2021 06:55:43] "GET /ads HTTP/1.1" 200 -
[advertisements] 2021-08-06 06:55:43,025 INFO [bootstrap] [ads.py:25] [dd.service=advertisements-service dd.env=development dd.version=1.0.1 dd.trace_id=1143986986509166125 dd.span_id=3336431127800903886] - attempting to grab banner at 2.jpg
[advertisements] 2021-08-06 06:55:43,027 INFO [werkzeug] [_internal.py:113] [dd.service=advertisements-service dd.env=development dd.version=1.0.1 dd.trace_id=0 dd.span_id=0] - 10.244.1.5 - - [06/Aug/2021 06:55:43] "GET /banners/2.jpg HTTP/1.1" 200 -
Any help would be greatly appreciated. Happy to provide more details if necessary.
Thanks!
-Ash
Hello
We have Storedog deployed on GKE, using the YAML files from "ecommerce-workshop/deploy/gcp/gke/" except datadog-agent.yaml. The frontend.yaml is configured like this.
But we can't see RUM data in Datadog. We are on the "datadoghq.eu" site.
Would it be possible to add "session replay" too?
Many thanks
Right now we're pointing to images built in personal accounts. Let's move this to a shared account for all learning labs, and put it in 1Password.
Depends on the attack box story #71 . The attack box should ssh in to one or more containers and modify a file on disk using vim or nano. Probably .ssh/authorized_keys ...
That container should be running the Datadog agent with file integrity monitoring enabled to empower the student to observe and analyze the finding.
Identify and implement one or more compliance bugs that can be placed into the ecommerce app. These could be unauthenticated APIs, TLS Misconfiguration, SSH Misconfiguration, etc. Feature flag these on for specific scenarios.
These will be used as part of the security foundations training scenario.
The Datadog agent supports ARM64 v8, so it would be handy to have ARM64 Docker builds alongside the x64 builds we ship now.
Take a look at any of the docker-compose files. Many of the images are set to pull latest. Can we go to a specific tag?
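One way to keep an eye on this is a small check that lists image references with no tag or with the latest tag. The sketch below scans compose text with a regular expression instead of parsing YAML, so it is a heuristic and not a full parser; the sample compose content is made up for the example.

```python
import re

# Matches lines such as `image: repo/name:tag`, with optional quotes.
IMAGE_LINE = re.compile(r"^[ \t]*image:[ \t]*[\"']?(?P<image>[^\s\"'#]+)", re.MULTILINE)


def unpinned_images(compose_text):
    """Return image references that have no tag or use the latest tag."""
    unpinned = []
    for match in IMAGE_LINE.finditer(compose_text):
        image = match.group("image")
        if "@" in image:
            continue  # pinned by digest
        # Only the last path component can carry a tag; a registry host may contain ':'.
        name = image.rsplit("/", 1)[-1]
        if ":" not in name or name.endswith(":latest"):
            unpinned.append(image)
    return unpinned


# Made-up sample; not copied from the repository's compose files.
SAMPLE = """
services:
  agent:
    image: "datadog/agent:7.29.0"
  discounts:
    image: ddtraining/discounts:latest
  db:
    image: postgres
"""

if __name__ == "__main__":
    print(unpinned_images(SAMPLE))
```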
The yaml to load the discounts service is loading the discount-service container which is 10 months old instead of discounts which is 21 days old. Should it be?
Right now image links point to my personal GitHub account. Let's update the docs so everything points to the proper place.
The current version is 0.54.2 but we used 0.4.1 and 0.50.0
src/broken-instrumented.patch:+gem 'ddtrace', '>= 0.4.1'
src/broken-instrumented.patch:+ ddtrace (0.50.0)
src/broken-instrumented.patch:+ ddtrace (>= 0.4.1)
src/instrumented-fixed.patch:+gem 'ddtrace', '>= 0.4.1'
src/instrumented-fixed.patch:+ ddtrace (0.50.0)
src/instrumented-fixed.patch:+ ddtrace (>= 0.4.1)
Currently, requirements.txt has ddtrace==0.46.0 and the latest version is 0.57.3
"db" is the name of the service we are using in docker-compose and k8s, but it should be configurable and not hardcoded, to make sure that those services will work in other environments
For now, the Docker image build process assumes we're using docker-compose's network names as shown here:
and here:
Expose these (now assumed) network routes as environment variables so we don't have to use docker-compose or build custom images with specific routes for workshops.
Hey team,
I would like to use this demo for a full monitoring workshop including RUM. Would it be possible to implement RUM?
As part of the setup it would be awesome to include as attribute of the session the user email or the amount of items in the cart as in this example:
https://docs.datadoghq.com/real_user_monitoring/installation/advanced_configuration/?tab=npm
Right now the storefront app has the discounts address hardcoded to discounts:5001 and ads to advertisements:5002. Ideally, both the address and ports should be configurable, as this is very specific to the current docker-compose & k8s deployment configuration.
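The general pattern for making such addresses configurable is to read them from environment variables and fall back to the current defaults. The storefront is a Ruby app, so the Python sketch below only illustrates the idea; the variable names (`DISCOUNTS_HOST`, `DISCOUNTS_PORT`, and so on) are hypothetical and not something the project defines.

```python
import os


def service_url(name, default_host, default_port, env=os.environ):
    """Build a service base URL from <NAME>_HOST / <NAME>_PORT with fallbacks."""
    prefix = name.upper()
    host = env.get(f"{prefix}_HOST", default_host)
    # Treat an empty value the same as an unset one.
    raw_port = env.get(f"{prefix}_PORT", "") or str(default_port)
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"{prefix}_PORT must be an integer, got {raw_port!r}")
    return f"http://{host}:{port}"


if __name__ == "__main__":
    # Defaults mirror the currently hardcoded values.
    print(service_url("discounts", "discounts", 5001))
    print(service_url("advertisements", "advertisements", 5002))
```

Failing early with a clear message on a non-integer port is a deliberate choice here, since an empty or malformed port value is otherwise easy to miss until a service refuses to start.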
Seeing this error a lot from discounts and advertisements in a docker-compose environment. These services do send traces correctly, but something tries to connect to the agent on localhost instead of datadog, the agent hostname. This results in lots of errors in logs that are distracting.
ERROR:ddtrace.internal.writer:failed to send traces to Datadog Agent at http://localhost:8126, 2 additional messages skipped
Discounts listens by default on port 5001 and ads on port 5002, but those two ports are already used by the Datadog agent (https://docs.datadoghq.com/agent/guide/network/?tab=agentv6v7#open-ports)
So, when deploying these services in VMs instead of containers, they clash with the datadog agent.
Right now, the app is running in development mode.
This allows for live reloading within workshop environments, and makes it possible to edit code and instrument without needing to rebuild containers.
As the Kubernetes environment already uses prebuilt containers, we should add images that are running in the production Rails environment.
To make things a bit more agnostic with Kubernetes deployments, it will be helpful to package the ecommerce deployment story into a (potentially publishable?) Helm chart.
A few people have noted different env settings depending on the service. The env tag is explicitly set in the Ruby service, but not the downstream Python services.
Let's make sure every service has an explicitly set environment, so we have consistent services.
Attempting to deploy the stack locally using Docker Compose with the help of a Make command.
make local-start
Getting below errors:
2023-10-29 17:33:50 /usr/local/bundle/gems/puma-3.12.6/lib/puma/dsl.rb:43:in `read': No such file or directory @ rb_sysopen - config/puma.rb (Errno::ENOENT)
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/dsl.rb:43:in `_load_from'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/configuration.rb:194:in `block in load'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/configuration.rb:194:in `each'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/configuration.rb:194:in `load'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/launcher.rb:61:in `initialize'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/cli.rb:71:in `new'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/lib/puma/cli.rb:71:in `initialize'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/bin/puma:8:in `new'
2023-10-29 17:33:50 from /usr/local/bundle/gems/puma-3.12.6/bin/puma:8:in `<top (required)>'
2023-10-29 17:33:50 from /usr/local/bundle/bin/puma:23:in `load'
2023-10-29 17:33:50 from /usr/local/bundle/bin/puma:23:in `<main>
In order for us to reliably verify changes to dependencies or code, we should get into the habit of testing the advertisements service in CI.