elastic / apm-server
Home Page: https://www.elastic.co/guide/en/apm/guide/current/index.html
License: Other
Request for official definitions of these in the docs. There seems to be some variation depending on the agent. This would be most useful for messaging and for communicating in the field.
To reproduce on macOS Sierra 10.12.6
# in /Users/pokus/golang/src/github.com/elastic/apm-server
$ make update
bash: virtualenv: command not found
make: *** [python-env] Error 127
Although $ make worked and produced the apm-server.
If this is an issue with Python: I have python2 -> ../Cellar/python/2.7.13_1/bin/python2 in /usr/local/bin and python2.7 -> ../../System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 in /usr/bin.
$PATH is /Users/pokus/.nvm/versions/node/v6.11.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/opt/go/libexec/bin
Not sure what to do.
edit: am I missing virtualenv?
apm-server should respond to OPTIONS and POST requests with correct access control headers:
Response to OPTIONS should include these
Access-Control-Allow-Origin "*";
Access-Control-Allow-Headers "Content-Type";
Access-Control-Allow-Methods "POST";
Note: We have to update Access-Control-Allow-Headers if we choose to use custom headers on the agent side but for now the above should be enough.
And the response to the original POST request should include:
Access-Control-Allow-Origin "*";
In production mode Access-Control-Allow-Origin should be configured by the user to something other than "*"
Docs:
https://fetch.spec.whatwg.org/
https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
Logstash and Beats agreed on a naming scheme. We should follow the same. See here for more details: elastic/beats#4984
@simianhacker FYI
Currently, our default port is 8080. This is a port that is often used for local development of web services, so there's a high probability of a clash if people run apm-server locally with the default settings.
To optimize for the "getting started" scenario, we should choose a port that is unlikely to already be in use.
The current stacktrace.json has both a wrong name and description. Rather, it should be frame.json.
This is a remnant from before we simplified the stacktrace object to be a simple list of frames.
git_ref is currently not used anywhere, but is still defined as an indexed field.
Libbeat offers the possibility to register metrics (see example here: https://github.com/elastic/beats/blob/master/filebeat/prospector/log/harvester.go#L40). These metrics can then be logged every x seconds or used for monitoring. We should add the following metrics. Please comment with additional metrics we should add:
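As a sketch of how such counters might be registered (using the stdlib expvar package as an analogue of libbeat's monitoring registry; the metric names are hypothetical suggestions):

```go
package main

import (
	"expvar"
	"fmt"
)

// Candidate counters, registered once at startup. In the real server these
// would go through libbeat's monitoring registry; expvar stands in here as
// the closest stdlib analogue.
var (
	requestsReceived = expvar.NewInt("apm-server.requests.received")
	requestsInvalid  = expvar.NewInt("apm-server.requests.invalid")
	eventsIndexed    = expvar.NewInt("apm-server.events.indexed")
)

// handlePayload updates the counters for one incoming request.
func handlePayload(valid bool, events int) {
	requestsReceived.Add(1)
	if !valid {
		requestsInvalid.Add(1)
		return
	}
	eventsIndexed.Add(int64(events))
}

func main() {
	handlePayload(true, 3)
	handlePayload(false, 0)
	fmt.Println(requestsReceived.Value(), requestsInvalid.Value(), eventsIndexed.Value())
}
```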
We should include instructions on how to install APM Server from the Elastic APT/YUM repositories similar to https://www.elastic.co/guide/en/beats/filebeat/current/setup-repositories.html
[root@soc apm-server-6.0.0-rc1-linux-x86_64]# npm install elastic-apm --save
npm WARN saveError ENOENT: no such file or directory, open '/opt/elasticsearch-apm/apm-server-6.0.0-rc1-linux-x86_64/package.json'
/opt/elasticsearch-apm/apm-server-6.0.0-rc1-linux-x86_64
└─┬ [email protected]
├── [email protected]
├── [email protected]
├── [email protected]
├── [email protected]
├─┬ [email protected]
│ └── [email protected]
├── [email protected]
├─┬ [email protected]
│ └─┬ [email protected]
│ └── [email protected]
├── [email protected]
├─┬ [email protected]
│ └── [email protected]
├── [email protected]
├─┬ [email protected]
│ ├── [email protected]
│ └── [email protected]
├── [email protected]
├── [email protected]
├─┬ [email protected]
│ ├── [email protected]
│ └── [email protected]
├─┬ [email protected]
│ ├── [email protected]
│ └─┬ [email protected]
│ └── [email protected]
├── [email protected]
├── [email protected]
├─┬ [email protected]
│ ├─┬ [email protected]
│ │ └─┬ [email protected]
│ │ ├── [email protected]
│ │ └── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├─┬ [email protected]
│ │ ├── [email protected]
│ │ └── [email protected]
│ └── [email protected]
├─┬ [email protected]
│ ├─┬ [email protected]
│ │ └─┬ [email protected]
│ │ └── [email protected]
│ └── [email protected]
└── [email protected]
npm WARN enoent ENOENT: no such file or directory, open '/opt/elasticsearch-apm/apm-server-6.0.0-rc1-linux-x86_64/package.json'
npm WARN apm-server-6.0.0-rc1-linux-x86_64 No description
npm WARN apm-server-6.0.0-rc1-linux-x86_64 No repository field.
npm WARN apm-server-6.0.0-rc1-linux-x86_64 No README data
npm WARN apm-server-6.0.0-rc1-linux-x86_64 No license field.
This is a meta issue for collecting documentation related issues that need to be resolved.
We should log a warning if the APM Server has a secret token set without SSL being enabled.
Sending the secret token over clear text HTTP is pointless and dangerous.
Alternatively, the APM Server could refuse to start entirely in this scenario, but users might still want to terminate SSL at an upstream proxy (for example, at the load balancer), so I vote for logging a warning instead of refusing to start.
Unify naming to make it easier to find fields for users.
An error in Transform makes the server return a 500 error:
https://github.com/elastic/apm-server/blob/master/beater/server.go#L137
This has 2 issues:
Semantic: the definition of 500 is "The server failed to fulfil a valid request", but in our specific case a potential error returned by Transform would happen with an invalid request.
Not testable: https://github.com/elastic/apm-server/blob/master/beater/server.go#L139 is not tested, and we don't know under which conditions it might be executed, so we cannot even test it end to end (without mocks).
After loading the Dashboards under Kibana 5.6, the following error is shown:
Visualize: [parsing_exception] [query] query malformed, no start_object after query name, with { line=1 & col=36 }
The same dashboards work under 6.x and 7.x
@jjfalling brought up that it might be the "{"language":"lucene","query":""}" in the visualizations which breaks the elements.
make package fails unless fields.yml was already created, typically by make update:
Execute /scripts/before_build.sh github.com/elastic/apm-server ./_beats
cp: cannot stat 'fields.yml': No such file or directory
make[1]: *** [prepare-package-cgo] Error 1
make: *** [package] Error 2
Short term fix could be to just document this requirement.
The name GetProcessors sounds more like a Java getter and might not be idiomatic Go. For reference, see this and this.
The method GetProcessors can be renamed to Processors, and GetProcessor to Processor.
I can make this change if you think this is sensible.
P.S.: Currently the methods on the register are not doing any work. They look like Java POJOs.
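A sketch of the rename, with the types simplified for illustration (Registry and Processor here are stand-ins, not the actual apm-server code):

```go
package main

import "fmt"

// Processor is a simplified stand-in for the real processor interface.
type Processor interface{ Name() string }

type txProcessor struct{}

func (txProcessor) Name() string { return "transaction" }

// Registry holds the registered processors. Following Go convention,
// the accessor drops the Get prefix: Processors, not GetProcessors.
type Registry struct {
	procs []Processor
}

func (r *Registry) Processors() []Processor { return r.procs }

func main() {
	r := &Registry{procs: []Processor{txProcessor{}}}
	for _, p := range r.Processors() {
		fmt.Println(p.Name())
	}
}
```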
follow on from #264
the docs link in the packaged Readme is invalid for all packages, as it is also built with the standard beats script.
If an agent sends a timestamp with e.g. month and day swapped:
#345 changed transaction_id to transaction.id. That means we need to update the dashboards as well.
We should discuss if we want to generate separate indices for errors and for transactions. The documents are semantically different and splitting up indices would allow easier handling in some cases.
Request Per Minute [APM] should use transaction.result as the filter field for the status code (or APM is sending the wrong field ;) )
We have two kinds of name conflicts right now:
Event, which is a type both in processors and in libbeat, and we call both things event independently.
Each processor has an Event and a Payload. I didn't notice this in the beginning, but this means that we have 2 payload_test files and 2 event_test files (for now; potentially this will happen for each processor).
When a test in any of those files fails, I don't have a way to know which actual file is failing, and I need to rerun one first and then the other to figure it out, which is not optimal.
It is also more complicated to open that file in the editor (I need to look at the full path) and to know what exact type of event and payload I have (I need to look at the namespace).
Personally, repeated names for different types make my work a bit harder than it needs to be. I would prefer to have different names for different things whenever possible (and I think in this case it is possible).
Should be mandatory, as follow up on Issue #214 .
This might be similar to this issue.
It looks like processor.go in the error and transaction packages are no different, so we have some redundancy.
Either we can use Go embedding or refactor in a better way. Will update with some ideas.
Initial work to allow this has started here: elastic/beats#5110
In #32 TLS was implemented. This is a follow up issue to track some additional changes which should be investigated or made:
Enabled() flag

For database traces we currently only have a single standardized context property called sql:
apm-server/docs/data/intake-api/generated/transaction/payload.json
Lines 145 to 147 in a0dd90c
It has previously been discussed how to generalize this to support other types of queries than just SQL. No conclusion was reached, so we've stuck with the current implementation for now.
The OpenTracing spec has thought about this as well, and I think their solution is pretty nice:
"context": {
"db": {
"instance": "customers",
"statement": "SELECT * FROM product_types WHERE user_id=?",
"type": "sql",
"user": "readonly_user"
}
}
Inspired by the Open Tracing Semantic Conventions Span tags table
I suggest we use this format, which is also easily extensible.
Following this #291 (comment)
We'll need to move this field again, unfortunately.
This is the reason why we want the transaction id in the same place across documents:
If i want to drill down on a transaction and see its associated traces i can’t do this in a single dashboard right now without a scripted field
I just did a little test to see what type of data the server responds with in case of an issue. I tried to trigger all types of errors below 5xx (I didn't have a proper way of generating a 5xx response).
All the responses I saw used the correct HTTP status code. That's good. But all responses used text/plain and not application/json as I would have expected.
For an HTTP API that's primarily consumed programmatically, we should prioritise responding with a format that's easily parsable by a computer. Below is how our responses look today.
Invalid URL (result: 404)
% curl localhost:8080/invalid
404 page not found
Invalid HTTP method (result: 405)
% curl localhost:8080/v1/errors
Only post requests are supported
Invalid Content-Type
(result: 400)
% curl -X POST -H 'Content-Type: text/plain' localhost:8080/v1/errors
Decoding error: invalid content type: text/plain
Data validation error (result: 400)
% curl -X POST -H 'Content-Type: application/json' -d '{"app":{"name":"foo","agent":{"name":"foo","version":"1"}},"errors":[{}]}' localhost:8080/v1/errors
Data Validation error: Problem validating JSON document against schema: I[#] S[#] doesn't validate with "error#"
I[#/errors/0] S[#/properties/errors/items/anyOf] anyOf failed
I[#/errors/0] S[#/properties/errors/items/anyOf/0/required] missing properties: "exception"
I[#/errors/0] S[#/properties/errors/items/anyOf/1/required] missing properties: "log"%
Notice how the 400 response repeats the message twice but uses slightly different wording, e.g. both "Data Validation error" and "Problem validating JSON document against schema". I suggest we just use the former.
Personally I always respond with an empty body for, for instance, 404 and 405 responses - as no extra info is needed. But since curl by default doesn't display the status code, there is a certain merit in displaying some text to a human for debugging purposes. So we can choose to continue responding with a body if it helps - but I'm also fine with just leaving the body empty in that case. We could also let this depend on the Accept header (curl will set Accept: */* by default, but our agents could set Accept: application/json).
The JSON Schema validation errors seem to contain line breaks and other special formatting that I expect comes directly from the JSON validator. It would be nice if this could be cleaned up in some way.
We had previously talked about always responding with JSON, e.g:
{"message": "foo bar baz"}
What do you think about implementing this? Would love to hear pros/cons 😃
I'd like to revisit the way in which we initialize processors. This has been discussed before, but I think circumstances (and general knowledge about the project) have changed over the last months, so IMO this is worth a second look. I'll try to explain myself as best I can.
Right now, each processor registers itself in an init function, and we make sure those functions are executed with a make command ( https://github.com/elastic/apm-server/blob/master/Makefile#L39 ) that generates a go file just with blank imports, and then we blank-import that file wherever it is needed:
Line 11 in 25d1d92
apm-server/beater/beater_test.go
Line 12 in 5f73b75
The original goal of this was to support arbitrary plugins, so that anyone could "plug" in a processor without having to worry about the rest. This is inspired by other beats.
However there are a few caveats applying this approach to the apm server:
Since the apm server is not self-contained, you can't be oblivious to the agents and the tailored UI. The server is in the middle of a three-step contract. You can't just "plug in" a processor and expect it to work without considering how it might impact other agents, how to spec it out, how to test it end-to-end, what happens to the curated dashboards, etc. You need the big picture no matter what.
It is hard to picture these plugins. I expect the biggest room for growth in APM is in adding agent support for more languages and frameworks, rather than customized features in the server.
But even then, features in the apm server don't necessarily fit the processor pattern of "ingest data over HTTP, then pipe it to ES". Consider for instance sourcemaps or the onboarding document...
Even if we want to facilitate plugin development, we should make a use-case-based effort to address them so we have a more concrete idea of what they entail, how they work, what value they provide, etc., instead of prematurely laying out the codebase in a certain way to accommodate some (more or less blurry) expectations.
That said, I found the processor initialization to be a problem while developing #227
I wanted to create e.g. a frontend/transaction package that could reuse transaction stuff, but the empty-import trick wouldn't pick it up. This limits how I can write processors.
I got stuck with some failing tests for quite some time because I forgot to run make update first so as to import the right thing (aka trigger the right side effect). This is very counter-intuitive; you just have to know it. At some point I got myself into a situation where either make update worked and make unit didn't, or the other way around.
Admittedly, blank-identifier imports are hacky. A command that generates a go file with just blank imports is even more hacky. Exceptions to the rule on top of already hacky code are definitely a concern:
apm-server/script/generate_imports.py
Line 26 in d9e691f
Another concern is the bug that we saw in the past about mixed tracing data. This happened because processors were holding state, while only one processor instance for each endpoint is created in the init method for the entire life-cycle of the server.
It is very easy and tempting to add state to processors (they are just structs) without realising how dangerous it can be, and I fear it can happen again.
TL;DR
I think we can simplify the code a lot, remove all these workarounds / side effects, and have safer processors just by registering them centrally in the beater.
This way you can have all the endpoint-processor-handler mappings in one place, and the implementation details (what processors do) are left in the processor packages.
I don't think that facilitating plugin development is a pressing need, and we can always tackle it in due time.
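A sketch of what central registration in the beater could look like (the types and route map are illustrative, not the actual proposal in the PR):

```go
package main

import "fmt"

// Processor is a simplified stand-in for the real processor interface.
type Processor interface {
	Name() string
}

type transactionProcessor struct{}

func (transactionProcessor) Name() string { return "transaction" }

type errorProcessor struct{}

func (errorProcessor) Name() string { return "error" }

// newRoutes maps each endpoint to its processor in one central place,
// with no init-time side effects and no generated blank imports. Fresh
// values are constructed per call, so no state leaks across requests.
func newRoutes() map[string]Processor {
	return map[string]Processor{
		"/v1/transactions": transactionProcessor{},
		"/v1/errors":       errorProcessor{},
	}
}

func main() {
	for path, p := range newRoutes() {
		fmt.Printf("%s -> %s\n", path, p.Name())
	}
}
```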
My attempt to solve this is at #229
The apm server helpfully spits out great validation error messages if a payload fails the JSON Schema validation:
2017/09/13 09:12:20.138559 server.go:164: INFO Data validation error: Problem validating JSON document against schema: I[#] S[#] doesn't validate with "error#"
I[#/errors/0/log] S[#/properties/errors/items/properties/log/required] missing properties: "message", code=400
This is great, but all the agent gets back is an empty 400 Bad Request response. It would be great if the validation error message could be returned to the agent, e.g. as a JSON payload.
While there is some support for character encoding in headers, anything but ASCII/Latin1 should be avoided due to possible interoperability issues.
As we include the secret token in the Authorization header, this impacts us. We should either limit the token to the ASCII character space, or introduce an encoding ourselves, e.g. base64.
If we go with base64, we can either decide to Just Do It ™️ without any regard to backwards compatibility (which might be a concern, as this won't make it into alpha1), or use a prefix (e.g. base64:) to indicate whether a token is encoded or not.
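The prefix variant could be sketched like this (helper names are hypothetical):

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

const prefix = "base64:"

// encodeToken wraps a token so it is safe to place in the Authorization
// header regardless of which characters it originally contains.
func encodeToken(token string) string {
	return prefix + base64.StdEncoding.EncodeToString([]byte(token))
}

// decodeToken accepts both encoded and legacy plain tokens, using the
// prefix to tell them apart (preserving backwards compatibility).
func decodeToken(v string) (string, error) {
	if !strings.HasPrefix(v, prefix) {
		return v, nil // legacy, unencoded token
	}
	raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(v, prefix))
	return string(raw), err
}

func main() {
	enc := encodeToken("sécret-tøken")
	dec, _ := decodeToken(enc)
	fmt.Println(enc, dec)
}
```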
Documentation describes setting the index pattern like this:
output.elasticsearch:
index: "apm-%{[app.name]}-%{[beat.version]}-%{+yyyy.MM.dd}"
here https://www.elastic.co/guide/en/apm/server/current/configuring.html#index-pattern
but changing it makes APM Server print:
2017/11/28 14:03:39.259288 beat.go:635: CRIT Exiting: setup.template.name and setup.template.pattern have to be set if index name is modified.
We should update the documentation to explain that these also need to be set and how to set these.
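For example, a working configuration needs all three settings together (the template name and pattern values here are illustrative):

```yaml
output.elasticsearch:
  index: "apm-%{[app.name]}-%{[beat.version]}-%{+yyyy.MM.dd}"

setup.template.name: "apm"
setup.template.pattern: "apm-*"
```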
The context.app.language.version field is currently required if context.app.language.name is set.
After an internal discussion we have decided that it should be optional.
That will allow @watson to implement the name field for the Node.js agent (elastic/apm-agent-nodejs#6).
It would be really useful to collect the source host / IP address from agents on every request. Apart from having the info, this feature would allow add_kubernetes_metadata to match the originating pod and enrich events with metadata from Kubernetes.
I've been playing with that idea and did this:
https://github.com/elastic/apm-server/compare/master...exekias:system-ip?expand=1
along with these settings:
processors:
- add_kubernetes_metadata:
indexers:
- ip_port:
matchers:
- fields:
lookup_fields: ["context.system.ip"]
so far it works.
The fields.yml file is used as a base to create documentation, Kibana index patterns, and indexing templates for ES.
Not all options that are allowed by Elasticsearch are implemented for index template creation right now. There are also some limitations per mapping param that might not be that obvious, e.g. enabled: false can only be used for mapping types and object fields (see the official docs for enabled and index).
It would be nice if we could document all known fields, with additional information on whether the field is indexed, searchable, etc. For this we need to implement additional logic in beats, see elastic/beats#4853
Right now we have set dynamic: false only for context. This means that no additional attributes are indexed dynamically. In case we don't want to automatically index all other fields, we either have to implement proper functionality for the index template creation or also set dynamic: false for other objects.
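For reference, the relevant fragment of a generated mapping would look roughly like this (a sketch, not the actual template):

```json
{
  "mappings": {
    "properties": {
      "context": {
        "dynamic": false,
        "properties": {}
      }
    }
  }
}
```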
@gingerwizard brought up an interesting problem. In a purposely-made inefficient view, he queries a list of 15000 items one by one instead of a single query, to demonstrate a problem that can be detected and analyzed with APM.
The problem is, due to the huge amount of traces, the payload surpasses our payload size limit, and the transaction is lost.
In the Opbeat agents, we did some optimization by only storing the stacktrace once, but that's not done in the new Elastic APM agents.
A very simplistic approach could be to check if the just-ending trace is equivalent (same trace name, same stack trace, maybe even same context) with the preceding trace. If yes, increment a counter field, and add the duration to the preceding trace, don't create a new trace.
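That collapsing step could be sketched like this (Trace is a simplified stand-in for the real trace model, with a string fingerprint in place of the actual stack trace comparison):

```go
package main

import "fmt"

// Trace is a minimal stand-in for the real trace model.
type Trace struct {
	Name     string
	Stack    string // stand-in for a stack-trace fingerprint
	Duration float64
	Count    int // how many equivalent traces were collapsed into this one
}

// collapse merges consecutive equivalent traces (same name and stack)
// into one, incrementing Count and summing Duration, as described above.
func collapse(traces []Trace) []Trace {
	var out []Trace
	for _, t := range traces {
		t.Count = 1
		if n := len(out); n > 0 && out[n-1].Name == t.Name && out[n-1].Stack == t.Stack {
			out[n-1].Count++
			out[n-1].Duration += t.Duration
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	traces := []Trace{
		{Name: "SELECT FROM items", Stack: "a", Duration: 1.2},
		{Name: "SELECT FROM items", Stack: "a", Duration: 1.1},
		{Name: "SELECT FROM items", Stack: "a", Duration: 1.3},
		{Name: "render", Stack: "b", Duration: 5.0},
	}
	for _, t := range collapse(traces) {
		fmt.Printf("%s count=%d duration=%.1f\n", t.Name, t.Count, t.Duration)
	}
}
```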
Even this simplistic approach has some drawbacks though. We lose some information (the duration of each individual repetition), and the UI would need to expose the count somehow. Also, things like "Top N queries" would get more complicated to implement.
Any ideas?
This meta issue should help to discuss and track refactorings which we want to do in apm-server to improve the code base. Each person should update and edit the tasks on this list.
@gingerwizard pointed out that it would be nice to have the field transaction_id always be called transaction_id across our output model. Today we have transaction.id and trace.transaction_id. Let's change it so it's transaction.transaction_id instead.
This makes it easier to filter for a specific transaction, for example in a dashboard, and show the transaction, all the traces, etc. with a single filter.
Could you please add automated builds to Docker Hub.
Kibana version: 6.0.0-rc2
Elasticsearch version: 6.0.0-rc2
Server OS version: CentOS
Browser version: Chrome
Original Kibana install method (e.g. download page, yum, from source, etc.): rpm packages
Description of the problem including expected versus actual behavior:
The APM Dashboards that get optionally loaded as part of the setup have the Last 24 hours timespan saved in them. So if you're looking at some other timespan of data anywhere in Kibana and open an APM dashboard, it changes the timepicker to Last 24 hours. It gets annoying after a few times if you're looking at data further in the past.
Any dashboard can be saved with its Store time with dashboard option set, but I don't think any of the beats dashboards have the time stored with them, so this seems unusual.
Since there are only 5 [APM] dashboards, it's pretty easy to just click Edit, uncheck Store time with dashboard, and click Save.
Or maybe there really is a reason that APM users would only ever want to see the last 24 hours and the Dashboards are fine as-is.
Steps to reproduce:
Last 24 hours
Are there any plans to create a Java agent to report Java internal metrics (JVM memory, JMX, threads, ...) and things like backend calls (web services, databases, ...)?
Related products:
I'm compiling a list of global issues that need to be cleaned up in the docs:
Example 1:
Use asciidoc attributes to resolve book paths.
To import the attributes, add the following lines to the index file (usually called index.asciidoc):
:branch: master
include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
Set branch to the correct branch for your project.
Then you can use any attributes in the shared attributes file to resolve the book path.
For example, {ref}/getting-started.html resolves to https://www.elastic.co/guide/en/elasticsearch/reference/master/getting-started.html in the generated html files.
A word of caution: If you make a typo or use an attribute that doesn't exist, the build won't warn you. In fact, it will delete the entire line of text after the attribute. So it's a good idea to verify that your links resolve correctly in the generated html.
Example 2:
You can use an asciidoc include to pull the contents of the JSON file into the document and then use code tags to format the JSON:
=== Transaction document
[source,json]
----
include::./data/elasticsearch/transaction.json[]
----
Note that you'll also need to update the conf.yaml so the doc build knows to look in this directory for files.
Example 3:
Specify an ID (anchor) for linking and to set the name of the generated html file.
[[apm-security]]
=== Security
With the [[ID]] specified, you get https://www.elastic.co/guide/en/apm/server/current/apm-security.html
Without it specified, you get https://www.elastic.co/guide/en/apm/server/current/_security.html. If you change the title, the URL changes...and stuff breaks.
Minor issue, but after installing the deb package:
# systemctl status apm-server
● apm-server.service - apm-server
Loaded: loaded (/lib/systemd/system/apm-server.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2017-10-30 11:25:09 UTC; 9min ago
Docs: https://www.elastic.co/guide/en/beats/apm-server/current/index.html
Main PID: 5106 (apm-server)
CGroup: /system.slice/apm-server.service
└─5106 /usr/share/apm-server/bin/apm-server -c /etc/apm-server/apm-server.yml -path.home /usr/share/apm-se
Oct 30 11:25:09 ubuntu-xenial systemd[1]: Started apm-server.
The "docs" link (https://www.elastic.co/guide/en/beats/apm-server/current/index.htm) is invalid and seems to be a leak from the Beats packaging system. I'm opening the ticket here for better visibility, but the fix might be needed in the Beats repo.
Enforce validations for custom and tags, as these attributes are user-defined key:value pairs.
tags key validation: already implemented; after checking ES and querying with Kibana, special characters are not an issue, but to avoid ambiguous meaning of . and * and to avoid having to unescape ", those chars are restricted in key names.
custom key validation: implement according to the tags key validation.
custom values: no restrictions; can be anything, as values won't be indexed.
tags values: only strings allowed, as those values are indexed.

Having the APM Server installed and running is critical to the APM experience, which is why, in terms of both onboarding and continued status, it would be great to have a status provided, perhaps written to Elasticsearch where it can be queried, indicating that the server is set up and running.