dynatrace / bosh-oneagent-release
BOSH release for Dynatrace OneAgent
License: MIT License
Please provide the SHA for your releases to verify the integrity of the artifact.
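Once a digest is published, a consumer could check a downloaded tarball roughly as follows. This is only a sketch: it assumes a SHA-256 digest and the `sha256sum` tool from GNU coreutils, and `verify_sha256` is a hypothetical helper name, not something this release ships.

```shell
#!/usr/bin/env bash
# verify_sha256 FILE EXPECTED_HEX
# Hypothetical helper: succeeds only if FILE exists and its SHA-256
# digest (lowercase hex) equals EXPECTED_HEX.
verify_sha256() {
  local file=$1 expected=$2 actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum "$file" 2>/dev/null | awk '{print $1}')
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example call (file name and digest are placeholders):
# verify_sha256 release.tgz "<published sha256>" || echo "digest mismatch" >&2
```

The function returns a plain exit status, so it can gate a pipeline step without parsing any output.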
We see no way to tell the Dynatrace agent not to monitor short-lived processes. This is required, for example, for Spark, where a VM starts in order to execute a remote call. We gain no additional insight from monitoring these short-lived processes, so we would like to exclude them.
On other platforms we can use DT_INJECTION_RULES to define which processes are irrelevant for monitoring. With this plugin there is currently no way to pass that variable to the agent.
Hi, our pipeline that builds your release for bosh.io is failing for your latest release. I haven't dug into it much, but it looks like some of the blobs in your S3 release bucket may not be publicly readable.
Cloning into 'releases-index'...
done.
Checking out files: 100% (26058/26058), done.
[dynatrace-oneagent-1.0.2] skipping
[dynatrace-oneagent-1.0.3] skipping
[dynatrace-oneagent-1.0.4] skipping
[dynatrace-oneagent-1.1.0] skipping
[dynatrace-oneagent-1.2.0] skipping
[dynatrace-oneagent-1.2.1] skipping
[dynatrace-oneagent-1.2.2] skipping
[dynatrace-oneagent-1.3.0] skipping
[dynatrace-oneagent-1.3.1] skipping
[dynatrace-oneagent-1.3.2] skipping
[dynatrace-oneagent-1.3.3] skipping
[dynatrace-oneagent-1.4.0] importing
panic: Failed: Processing release: release=misc.Release{DirPath:"/tmp/build/6b387ced/release", MFPath:"/tmp/build/6b387ced/release/releases/dynatrace-oneagent/dynatrace-oneagent-1.4.0.yml", releaseReaderFactory:release.ReaderFactory{downloader:downloader.MuxDownloader{mux:map[string]downloader.Downloader{"http":downloader.HTTPDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "https":downloader.HTTPDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "file":downloader.LocalFSDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "git":downloader.GitDownloader{fs:(*system.osFileSystem)(0xc0000acc40), runner:system.execCmdRunner{logger:(*logger.logger)(0xc0000b7c00)}, logger:(*logger.logger)(0xc0000b7c00)}}, logger:(*logger.logger)(0xc0000b7c00), downloadedPaths:map[string]downloader.Downloader{}}, extractor:tar.CmdExtractor{runner:system.execCmdRunner{logger:(*logger.logger)(0xc0000b7c00)}, fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, jobReaderFactory:job.ReaderFactory{downloader:downloader.MuxDownloader{mux:map[string]downloader.Downloader{"http":downloader.HTTPDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "https":downloader.HTTPDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "file":downloader.LocalFSDownloader{fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, "git":downloader.GitDownloader{fs:(*system.osFileSystem)(0xc0000acc40), runner:system.execCmdRunner{logger:(*logger.logger)(0xc0000b7c00)}, logger:(*logger.logger)(0xc0000b7c00)}}, logger:(*logger.logger)(0xc0000b7c00), downloadedPaths:map[string]downloader.Downloader{}}, 
extractor:tar.CmdExtractor{runner:system.execCmdRunner{logger:(*logger.logger)(0xc0000b7c00)}, fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}, fs:(*system.osFileSystem)(0xc0000acc40), logger:(*logger.logger)(0xc0000b7c00)}} Building tarball: executing bosh: exit status 1 (stdout: {
"Tables": null,
"Blocks": null,
"Lines": [
"-- Started downloading 'dynatrace-oneagent/d5cb01a64f257e8e3a4728364705070cd80a73e26bee4b5e7a3a6f6da89ef0ce' (sha1=sha256:90c2d1f560be79458ec6c432f5af1b4ee8b58e09a7247edc46f1e8754a3ebb7c)\n",
"-- Started downloading 'dynatrace-oneagent-windows/7719df91fdd16480626a78856e782df7608e8ef8336b342c59535c158f198f39' (sha1=sha256:44272c03a7f59d66edc24b3bdd4e65bc8ad7c3d8da4950ad8b57f9841415b4bb)\n",
"-- Failed downloading 'dynatrace-oneagent-windows/7719df91fdd16480626a78856e782df7608e8ef8336b342c59535c158f198f39' (sha1=sha256:44272c03a7f59d66edc24b3bdd4e65bc8ad7c3d8da4950ad8b57f9841415b4bb)\n",
"-- Failed downloading 'dynatrace-oneagent/d5cb01a64f257e8e3a4728364705070cd80a73e26bee4b5e7a3a6f6da89ef0ce' (sha1=sha256:90c2d1f560be79458ec6c432f5af1b4ee8b58e09a7247edc46f1e8754a3ebb7c)\n",
"- Downloading blob '6113ef44-4343-47e3-6751-69823a85eee1' with digest string 'sha256:44272c03a7f59d66edc24b3bdd4e65bc8ad7c3d8da4950ad8b57f9841415b4bb':\n Getting blob from inner blobstore:\n Getting blob from inner blobstore:\n AccessDenied: Access Denied\n\tstatus code: 403, request id: 3EB99292CCB1749C, host id: f9Hru+RR3omvYRHRmFD5ejhslzIpEreuGzSJOvYREyAS1ZwqBl88MX+fnuBl03HFEJsNf4QnfF8=\n- Downloading blob '130126bf-c1b7-44aa-5dcb-0835d9566f2b' with digest string 'sha256:90c2d1f560be79458ec6c432f5af1b4ee8b58e09a7247edc46f1e8754a3ebb7c':\n Getting blob from inner blobstore:\n Getting blob from inner blobstore:\n AccessDenied: Access Denied\n\tstatus code: 403, request id: 1BE70B51DBCD5211, host id: 351RDsjBQF3iuDyFwkHxBN0mXgSMqj4PpFCm6eTS0jvp8EljO59FrwC4fYPQIFCGpSkxb+AgTqc=",
"Exit code 1"
]
} stderr: )
goroutine 1 [running]:
main.main()
/tmp/build/6b387ced/worker/src/worker/create-releases.go:25 +0x250
exit status 2
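One way to test the "not publicly readable" hypothesis is to request a blob anonymously and look at the HTTP status. A minimal sketch, assuming `curl` is installed; `blob_is_public` is a hypothetical helper, and the blob URL has to be supplied by whoever knows the bucket address, since it is not given in this report.

```shell
#!/usr/bin/env bash
# blob_is_public URL
# Hypothetical helper: sends an unauthenticated HEAD request and succeeds
# only if the server answers with HTTP 200. A 403 would be consistent
# with the "AccessDenied" errors in the log above.
blob_is_public() {
  local url=$1 status
  # -I: HEAD request, -s: silent, -o /dev/null: discard headers,
  # -w '%{http_code}': print only the numeric status code.
  status=$(curl -s -o /dev/null -I -w '%{http_code}' "$url")
  [ "$status" = "200" ]
}

# Example call (placeholder URL):
# blob_is_public "https://<bucket-host>/<blob-id>" || echo "blob is not public" >&2
```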
Dear Dynatrace team,
We currently have an issue where Dynatrace blocks a deployment process.
The issue occurs when the uninstall script has been called or when the drain script did not complete successfully.
In this case, Monit tries to revive the OneAgent using the start-oneagent script. However, since that script does not contain an installer call, it fails.
The affected VM goes into the “STOPPED” state. When trying to redeploy, the following error occurs:
Task 1234567 | 16:00:00 | Updating instance dummy: dummy/ffffffff-ffff-ffff-ffff-ffffffffffff (0) (canary) (00:02:00)
L Error: Action Failed get_task: Task ffffffff-ffff-ffff-ffff-ffffffffffff result: Stopping Monitored Services: Stopping services '[dynatrace-oneagent]' errored
Task 1234567 | 16:00:00 | Error: Action Failed get_task: Task ffffffff-ffff-ffff-ffff-ffffffffffff result: Stopping Monitored Services: Stopping services '[dynatrace-oneagent]' errored
You either have to force-delete the affected deployment and redeploy it afterwards, or SSH to the affected VM and run the pre-start script manually.
An easy solution would be to add a recovery routine to the start script.
Something like this:
if ! runServiceCommand start; then
    echo "error: Could not start Dynatrace OneAgent-Service"
    echo "info: An attempt is made to repair the local Dynatrace OneAgent-Service"
    if install_dynatrace; then
        if ! runServiceCommand start; then
            echo "error: Could not repair the local Dynatrace OneAgent-Service"
            exit 1
        fi
    else
        log "error: Could not repair the local Dynatrace OneAgent-Service"
        exit 1
    fi
fi
Where install_dynatrace is the setup function from the pre-start script.
Thanks, and best regards
Christoph
Hi,
When we try to deploy the add-on to a Windows TKGi cluster, which is composed of Windows and Ubuntu VMs, BOSH tries to use the shell scripts in the monit config instead of the PowerShell scripts.
Monit config file on the Windows VM:
PS C:\var\vcap\jobs> cat .\dynatrace-oneagent\monit
check process dynatrace-oneagent
  with pidfile /var/vcap/sys/run/dynatrace-oneagent/dynatrace-watchdog.pid
  start program "/var/vcap/jobs/dynatrace-oneagent/bin/start-oneagent.sh"
    with timeout 600 seconds
  stop program "/var/vcap/jobs/dynatrace-oneagent/bin/stop-oneagent.sh"
    with timeout 120 seconds
  group vcap
Example of what a monit file should look like on Windows VMs:
PS C:\var\vcap\jobs> cat .\kubelet-windows\monit
{
  "processes": [
    {
      "name": "kubelet",
      "executable": "powershell",
      "args": ["C:\\var\\vcap\\jobs\\kubelet-windows\\bin\\kubelet_ctl.ps1"],
      "env": {}
    }
  ]
}
Error:
Configuring job dynatrace-oneagent: Adding monit configuration: invalid character 'c' looking for beginning of value
The error occurs because the monit config file on the Windows VM is not valid JSON.
We have runtime configurations for both the Ubuntu Xenial and Windows2019 stemcells set in Ops Manager, as in the example below.
BOSH runtime config for the Windows add-on:
releases:
- name: dynatrace-oneagent
  version: 1.4.0
addons:
- name: dynatrace-oneagent-sandbox2-windows-addon
  jobs:
  - name: dynatrace-oneagent
    release: dynatrace-oneagent
    properties:
      dynatrace:
        environmentid: <redacted>
        apitoken: <redacted>
        apiurl: <redacted>
        hostgroup: sevice-instance_b48b90ba-5b07-4a03-9ee4-16801560eb0d
        hosttags: cluster=TANZU_SANDBOX2 landscape=Tanzu_LS team=Tanzu_T
        hostprops: Department=Infrastructure Stage=Sandbox
        infraonly: 0
  include:
    deployments:
    - service-instance_7adea55a-5905-4768-85f9-2146c802c573
    stemcell:
    - os: windows2019
  exclude:
    lifecycle: errand
While working on enabling the infraonly flag for a BOSH deployment, we came across a quirk.
When we enable infraonly mode and then try to switch back to full-stack, the bosh agent does not do it. Is this the intended behaviour?
Switching from full-stack to infraonly works. However, it cannot switch back to full-stack mode again. It seems this switch can only happen when the BOSH VM is recreated.
While trying to understand what is going on, we made a few observations:
There is a file, infraonly.conf, which gets initialized when the flag is enabled.
We tried using both flags: infraonly and INSTALLERARGS: INFRA_ONLY=1
We noticed that the installer file located at /var/vcap/packages/Dynatrace-OneAgent-Linux-x.xxx.sh contains logic that sets this file, and it seems to be a one-time initialisation.
The agent is reset only when the VM is recreated (i.e. when the config files are no longer present in the /var/lib location).
This makes it difficult to use in production environments, where we need to be able to switch modes.
Hey,
Could you please add the directories .final_builds and releases to the GitHub repository? This is necessary for bosh.io to pick it up and display it here: http://bosh.io/releases/github.com/dynatrace-innovationlab/bosh-oneagent-release?all=1
Hello,
After updating the release version from 1.3.2 to 1.3.3, our proxy setup stopped working. Once we reverted to 1.3.2, the issue was gone.
sha1: fd97b6068e76e5b3265b9cfd5a121e6a7bb26a37
url: https://bosh.io/d/github.com/Dynatrace/bosh-oneagent-release?v=1.3.3
version: 1.3.3
We saw the following error in the install log:
11:04:18 Dynatrace OneAgent failed to connect to Dynatrace Cluster Node https://dynatrace.random:8443/communication. See log file for details: /var/vcap/data/dynatrace/oneagent/log/os/ruxitagent_host_14851.0.log Installation finished
And this is ruxitagent_host_14851.0.log:
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Storage path ................ /var/vcap/data/dynatrace/oneagent
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Tenant UUID ................. 02ed582b-10d4-4dde-895b-98f18918cdbe
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Tenant ID ................... 0xcc09a1cd
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Network zone ................
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Agent ID .................... 0x31dcd25339b2cbb1
2020-10-28 11:10:29.182 UTC [e88dd8c8] info [native] Process group ID ............ 0x2e81c7cd12261c61
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] OSI ID ...................... 0x99ae3242cb44636d
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Node ID ..................... 0x0000000000000000
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Process group instance ID ... 0xb8cbc19f0dba6303
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Container group ID .......... 0x0000000000000000
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Container group instance ID . 0x0000000000000000
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Container ID ................
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Agent host .................. 2d2397f0-d596-42ed-a12c-470efb4cd0bf
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Injection mode .............. UNKNOWN
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Standalone .................. no
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Log file aging .............. disabled
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Agent name .................. host
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Server/Collector ............ https://dynatrace.random:8443/communication;https://cluster-activegate.dynatrace.random:9999/communication
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Proxy .......................
2020-10-28 11:10:29.183 UTC [e88dd8c8] info [native] Dispatcher buffersize ....... 419430400
Hi,
At the moment, the pre-start script calls the installer without checking the return code afterwards.
(See https://github.com/Dynatrace/bosh-oneagent-release/blob/master/jobs/dynatrace-oneagent/templates/pre-start.erb#L165)
As a result, valuable information about the error is lost.
In the following example, a failed download of the installer causes the installation to fail.
The installer returns exit code 7. However, instead of trying to download the installer again, the pre-start script continues.
…
/opt/dynatrace/oneagent is on /var/vcap/data
Extracting...
Warning: S/MIME signature is missing
Unpacking. This may take a few minutes...
Error: Archive is corrupted. Installation aborted.
Setting oneagentwatchdog pid
Installation finished
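The check described above could look roughly like this. It is a sketch only, not the actual pre-start code: `retry` is a hypothetical helper, and `download_and_install` in the usage comment stands in for whatever the pre-start script really does to fetch and run the installer.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to ATTEMPTS times, stops at the
# first success, and otherwise returns the exit code of the last attempt
# so the caller can log it instead of silently continuing.
retry() {
  local attempts=$1; shift
  local rc=0 i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    rc=$?
    echo "warning: '$*' failed with exit code ${rc} (attempt ${i}/${attempts})" >&2
  done
  return "$rc"
}

# Example use in a pre-start script (download_and_install is a placeholder):
# if ! retry 3 download_and_install; then
#   echo "error: installer failed after 3 attempts" >&2
#   exit 1
# fi
```

Propagating the last exit code keeps the installer's own error (such as the code 7 mentioned above) visible in the BOSH task output.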
Best regards
Christoph
It appears that both job specs include an empty packages: key. This translates into packages being nil; however, packages is expected to be an array of strings. This error prevents bosh.io from picking up the new release.
cc @voelzmo
Hi Team,
are there plans to test/validate the use of the Dynatrace bosh release with the upcoming Bionic stemcells (April 2021)?
We are facing a situation in which nginx, when restarted, stops responding to routes specified by https://github.com/cloudfoundry/cloud_controller_ng/blob/e7e6ed316a89bb578ecec0aedc88fb61b8fe362c/bosh/jobs/cloud_controller_ng/templates/nginx.conf.erb#L191. This behavior is not seen if the addon is not installed. The requests we see failing all have a multipart/form-data header.
This one is not answered by nginx:
$ curl -H"Content-Type: multipart/form-data; boundary=o" -XPUT -d "foo" https://URL/v2/apps/70964c35-045e-4b0e-bed9-c1794ebbff1/bits?async=true
But this one is:
$ curl -H"Content-Type: multipart/form-data; boundary=o" -XPUT -d "foo" https://URL/v2/apps/70964c35-045e-4b0e-bed9-c1794ebbff1/something_else?async=true
And this one is:
$ curl -XPUT -d "foo" https://URL/v2/apps/70964c35-045e-4b0e-bed9-c1794ebbff1/bits?async=true