elastic-agent's People

Contributors

adriansr, aleksmaus, andersonq, andrewkroh, andrewvc, apmmachine, blakerouse, chrsmark, cmacknz, dedemorton, exekias, faec, fearful-symmetry, jsoriano, kaiyan-sheng, kuisathaverat, kvch, leehinman, marc-gr, michalpristas, monicasarbu, narph, p1llus, ph, ruflin, sayden, tsg, v1v, vjsamuel, ycombinator

elastic-agent's Issues

[Elastic Agent][Discuss] elastic-agent.yml overwrite needed?

Today, when the Elastic Agent is enrolled into Fleet, the existing elastic-agent.yml file is backed up and state indicating that the agent is managed is written into elastic-agent.yml. In addition, the Agent writes fleet.yml with more data. On startup, elastic-agent.yml is checked to see if the agent is enrolled. This initially caused some issues in Cloud because the file was overwritten.

This issue is to have a more general discussion around the purpose of these files. Do we need to write state to elastic-agent.yml when enrolled? Is fleet.yml enough? How exactly do we use each file? The goal is to come up with a guideline so that future development stays aligned with it.

[Agent] How we could reduce the need for root privileges for beats.

Original comment by @ph:

The Agent starts the Beats processes with the same user as the agent process, which means root. This is less than ideal if we want to lock down the processes and reduce risk.

TODO:
Define stories

  • Behavior of Metricbeat
  • Behavior of Auditbeat
  • Behavior of Packetbeat.

Documentation calls for "elastic-agent" but the service name installed is "Elastic Agent"

The docs for the Windows installation of the Elastic Agent reference an elastic-agent service in both the start and stop commands:
https://www.elastic.co/guide/en/fleet/current/start-elastic-agent.html
https://www.elastic.co/guide/en/fleet/current/stop-elastic-agent.html

If all the steps from the doc have been followed during installation, the name of the service will be Elastic Agent instead.
So Stop-Service elastic-agent and Start-Service elastic-agent will fail.
The source code https://github.com/elastic/beats/blob/master/x-pack/elastic-agent/pkg/agent/install/paths_windows.go#L20 confirms the service name as well.
Documentation should be updated.
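
A minimal example of commands that do work against the installed service name (assuming a default install, where the service is registered as "Elastic Agent"):

Stop-Service "Elastic Agent"
Start-Service "Elastic Agent"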

For confirmed bugs, please report:

  • Operating System: Windows

Refactor: Use the capabilities to define whether an Agent is upgradable or not

Today we can use https://github.com/elastic/beats/blob/64f70785c0911eeb6f3f6ce5264f61544844ca0f/x-pack/elastic-agent/pkg/agent/application/upgrade/upgrade.go#L78 to define whether a release is upgradable or not. We have added the concept of capabilities in elastic/beats#23848.

We should add this to the capabilities file; it could look like this:

capabilities:
-  upgrade: false

@ruflin @mostlyjason WDYT about making it generic over the actions that we support (https://github.com/elastic/beats/tree/1f1fae56057dce0604f72f2cf0099f9a6f2b75aa/x-pack/elastic-agent/pkg/agent/application/pipeline/actions/handlers)?

capabilities:
- rule: deny
  action: Upgrade

[Elastic Agent] Support Kafka as an output for elastic agent

Describe the enhancement:
Bringing the Elastic Agent more in line with outputs supported by Beats.

Describe a specific use case for the enhancement or feature:

Enable customers who currently use Beats to send events/logs to a Kafka broker to recreate the same environment and functionality using the Elastic Agent. Lack of this capability may be an inhibitor to Elastic Agent adoption.

Fleet installation script fails to detect error in service start

Description
elastic-agent install fails to detect a problem during service start and reports a misleading message, Installation was successful and Elastic Agent is running, even though the service has not been able to start (e.g. because a process is already bound to port 6789).

The script should at least report that the agent was installed but there was a problem starting the service.

How to reproduce the bug

  1. A process is already listening on localhost:6789, e.g.:
# netstat -natop | grep 6789
tcp6       0      0 :::6789                 :::*       LISTEN      1891/docker-proxy    off (0.00/0/0)

  2. Run the elastic-agent install command in the CLI:
ubuntu@server:~$ sudo ./elastic-agent install -f --kibana-url=https://<URL> --enrollment-token=<token>
The Elastic Agent is currently in BETA and should not be used in production

2020-12-03T16:43:31.069+0100	DEBUG	kibana/client.go:170	Request method: POST, path: /api/fleet/agents/enroll
Successfully enrolled the Elastic Agent.
Installation was successful and Elastic Agent is running.

The installation script reports Installation was successful and Elastic Agent is running., but the agent never shows up as enrolled in the Kibana Fleet UI.

  3. Checking the output of journalctl -u elastic-agent.service, we can see the process wasn't able to start because the address is already in use:
#  journalctl -u elastic-agent.service
-- Logs begin at Sat 2020-08-29 18:15:02 CEST, end at Thu 2020-12-03 16:43:51 CET. --
nov 17 16:25:53 server systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:25:53 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:25:53 server elastic-agent[1514327]: starting GRPC listener: listen tcp 127.0.0.1:6789: bind: address already in use
nov 17 16:25:53 server systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
nov 17 16:25:53 server systemd[1]: elastic-agent.service: Failed with result 'exit-code'.
nov 17 16:27:53 server systemd[1]: elastic-agent.service: Scheduled restart job, restart counter is at 2.
nov 17 16:27:53 server systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:27:53 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:27:54 server elastic-agent[1514463]: starting GRPC listener: listen tcp 127.0.0.1:6789: bind: address already in use
nov 17 16:27:54 server systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
nov 17 16:27:54 server systemd[1]: elastic-agent.service: Failed with result 'exit-code'.
...

Workaround
We can change the default port in /opt/Elastic/Agent/elastic-agent.yml from 6789 to e.g. 16789:

fleet:
  enabled: true
agent.grpc:
  address: localhost
  port: 16789

Then restart the service and check that it is up:

# sudo systemctl start elastic-agent.service
# 
# sudo  journalctl -u elastic-agent.service -f
-- Logs begin at Sat 2020-08-29 18:15:02 CEST. --
dic 03 16:53:20 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..

[Fleet] Agent fails when host disk space is full, need better support

A user brought this to us and I am logging a quick ticket to capture minimal details.

Apparently the Agent failed to install metricbeat, due to a lack of disk space.

It wasn't clear immediately, but some subsequent log diving shows the reason:

/var/lib/elastic-agent/logs/elastic-agent-json.log.2:{"log.level":"error","@timestamp":"2020-11-11T11:13:12.997-0500","log.origin":{"file.name":"log/reporter.go","file.line":36},"message":"2020-11-11T11:13:12-05:00: type: 'ERROR': sub_type: 'FAILED' message: Application: filebeat--7.9.3--36643631373035623733363936343635[2ff0699f-4ef0-4d57-84b3-053a760c711e]: State changed to FAILED: TarInstaller: error writing to /var/lib/elastic-agent/install/filebeat-7.9.3-linux-x86_64/NOTICE.txt: write /var/lib/elastic-agent/install/filebeat-7.9.3-linux-x86_64/NOTICE.txt: no space left on device","ecs.version":"1.5.0"}

What should we expect of Elastic Agent here? It's not clear what it can do, except perhaps purge old log files. What else can we think of, and what should be shown in the Activity log, etc.?

Thanks @P1llus for bringing it to us in slack

[Agent] Configuration validations on Fleet and Agent side

We allow people to see and edit the configuration in Fleet, so it might be a good idea to share validations, or at least high-level validations of the expected fields like ids, type, and metricset. This can also be useful for communicating with other teams.

We have created an example configuration, but this is not enough to express what we are expecting. We need a formal way to define it; one option would be a json-schema definition that could be used by both the agent and Fleet.
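
For illustration only, a json-schema fragment for a single input could look roughly like this (hypothetical field names, not an agreed format):

{
  "type": "object",
  "required": ["id", "type"],
  "properties": {
    "id": { "type": "string" },
    "type": { "type": "string" },
    "streams": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "metricset": { "type": "string" }
        }
      }
    }
  }
}

Both the agent and Fleet could load the same schema file, so the validation rules stay in sync.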

[elastic-agent] support resource limitations on child processes

Summary

When the elastic agent installs a new input, it starts a new process or restarts an existing process with additional input configuration. The agent does not apply any resource limits to the created subprocesses, potentially leading to the processes competing for available resources. This can become an issue when multiple processes run with high load, reaching the limit of available resources. We need a solution for limiting resource usage per subprocess.

It becomes especially important when the resources for the elastic agent are already restricted, which will be the case for the hosted elastic agent.

There is currently no concept available for how the memory/cpu shares available to the elastic agent should be distributed between processes. Most probably we would not want to limit the subprocesses by default, but only if configured. For hosted agents the orchestrator should pass a configuration to the container where the agent is running.
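
As an illustration, a per-process limit section in the agent configuration could look roughly like the following (purely hypothetical keys; no such setting exists today):

agent.limits:
  filebeat:
    memory: 200MiB   # hypothetical: hard memory cap applied via cgroups
    cpu: 0.5         # hypothetical: fraction of one CPU via period/quota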

TODO

  • Do we need a solution for non-containerized environments which are not supporting cgroups?
  • Where should the configuration for resource limitations live?
  • Does the configuration need to be validated, e.g. sum of shares needs to be <=100% of available resources
  • how to retrieve available resources by the elastic agent, e.g. use ENV for passing in overall restrictions
  • define supported limitations (e.g. CPU period/quota)
  • do we need concrete limits, or is setting process priorities already enough?
  • privileges the elastic agent needs for applying resource limitations to the subprocesses (most probably not an issue, as it already has privileges to start the processes as root)

Update release manager with the M1 artifact

Describe the enhancement:

  • Add m1 artifact to the release manager code
  • Allow artifacts to be uploaded to the website (requires synchronization with the marketing team)
  • Add m1 artifact to be signed

Describe a specific use case for the enhancement or feature:

Relates: #151

[Elastic-Agent] [Docker] Discuss: accessing logs from different container

I'd like to ask for your recommendation for users that prefer to run Elastic Agent in the container (let's say due to security reasons).

Let's discuss the scenario:

The integrated product is nginx running in a container. It produces logs that are stored locally inside the container and are rotated. As the agent is running in a different container, it can't simply access the produced logs.

What is your recommendation in this particular case? Should the user somehow expose the log files? Mirror them?

Background:
I had an interesting talk with @ycombinator about possibilities and testing scenarios, and it looks like we will both have to nail this problem (force the agent to watch logs produced in a different container).

[Elastic Agent] Remove agent.type field so it doesn't leak Beats details

The agent.type used by Elastic Agent currently leaks details about the underlying Beats. With my 7.13 agent, it's set to "filebeat" for logs and "metricbeat" for metrics. We don't want to create a user dependency on this Beats information because it may be refactored out in the future.

The solution for now is to not populate it and to add it later. The reason is that elastic-agent is not the actual agent in all scenarios; this is true when it runs as a server (HTTP server), where the agent.id and type could come from the sender. Leaving it out reduces this mess for now and gives us an opportunity to clean it up. Adding the field later is an addition, whereas removing it later is a breaking change. I think there are many other meta fields that we can likely already use for debugging.

Agent per process Metrics document standardization

When using the /stats API, the event is returned as is. When collecting stats from Beats, the beats namespace does contain process metrics like cgroup, CPU, and memory usage, but the process name is not included. When metrics are queried via Agent, the beats namespace should become process, and the field process.name should be added, containing the process name known to the agent (e.g. filebeat-default-monitoring).

Although we have no event routing available yet, data sources should be encouraged to provide data stream metadata as hints (which can still be ignored). When querying process stats via the agent, the JSON document published by Agent should include the data stream fields.

The change could be added to libbeat, or (maybe easier) as a processing step in Agent. When done in Agent we already have a place where we can massage endpoint stats in the future.
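
For illustration, a per-process stats document queried via Agent could end up looking roughly like this (values and data stream names are placeholders, not a decided format):

{
  "process": {
    "name": "filebeat-default-monitoring",
    "cpu": { "total": { "value": 3420 } },
    "memstats": { "rss": 50405376 }
  },
  "data_stream": { "type": "metrics", "dataset": "elastic_agent.process", "namespace": "default" }
}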

Elastic Agent non-Windows host - Agent doesn't finish installing until after re-starting the service

Hi, this is a spin-off of testing done in support of
elastic/beats#26665

  • testing done with 7.14 BC4 Agent and Cloud based stack

I'm transferring this issue from the Endpoint team, to Beats / Agent.

From @dikshachauhan-qasource: we have attempted to validate the Endpoint behavior on French VMs and found it working fine, with a small glitch.

Observations:
Scenario 1:
Installed the agent under a policy having Endpoint.

The agent remained in the Updating state until we manually restarted the elastic-agent service.
The host then updated to Healthy status and was available under the Endpoint tab with status 'success'.
Data streams were working fine.
All binaries were in a running state.
Recording:
https://user-images.githubusercontent.com/12970373/127567223-9c1fd3ee-4216-4837-b0a6-2d6cb45d0300.mp4

Scenario 2:
Unenrolled and then re-installed the agent under the same policy having Endpoint.

Observations same as mentioned above.
Scenario 3:
Unenrolled and then re-installed the agent under the Default policy. Later, after installation of the agent, we added Endpoint Security.

Observations same as mentioned above.
screenshot:
windows-10-french

Logs.zip:
logs-french-win-10-agent.zip

Elastic-Agent Docker: silent standard output

Hi,

I was researching different ways of enabling verbose logging in Elastic Agent in the context of elastic/elastic-package#86. I'd like to collect Elastic Agent and subprocess logs in some folder (which can be mounted and exposed externally).

Then I came to a different conclusion: the standard log output of Elastic Agent is silent even though the application is running in the background and logging data to .log files:

{"isInitialized":true}{"isInitialized":true}{"list":[{"id":"935ec4c0-5415-11eb-b36e-d53bf68e2a18","active":true,"api_key_id":"4lTA8XYBqSxScuxU6GUe","name":"Default (de7d6165-b378-4b41-a770-2b419e856d98)","policy_id":"8afbdb60-5415-11eb-b36e-d53bf68e2a18","created_at":"2021-01-11T14:02:00.844Z"}],"total":1,"page":1,"perPage":20}
935ec4c0-5415-11eb-b36e-d53bf68e2a18
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   324  100   324    0     0  11899      0 --:--:-- --:--:-- --:--:-- 12000
NGxUQThYWUJxU3hTY3V4VTZHVWU6dVlRSWN5WERSUXl2RDRDX1JoYnRfZw==
The Elastic Agent is currently in BETA and should not be used in production

Successfully enrolled the Elastic Agent.
Elastic Agent might not be running; unable to trigger restart

(no more log messages)

This behavior doesn't seem to be consistent with other dockerized apps like Kibana or Elasticsearch, which print leveled log messages by default. What do you think about changing this behavior to a similar pattern? I would also appreciate a special combined logging mode that can merge log outputs from multiple sources like Elastic Agent, Metricbeat, Filebeat, etc.

[Elastic Agent] Report running processes and their health statuses

This is related to elastic/kibana#75236 and elastic/kibana#99068, both of which are longer-term efforts around enabling more granular status reporting of "integrations" that are running on Elastic Agent. But Agent has no concept of integrations, only which inputs/processes are running.

Still, reporting that information is useful and would get us closer to our longer-term goals. In the short term, this would enable Endpoint to filter agents by which ones are running Endpoint without doing additional JOIN-like queries.

I'd like to propose that agents:

  • Report what inputs/processes are running
  • Report the health status of each
  • Store the above information in local_metadata field

One thing to consider in deciding the data structure for how this information should be stored is that in the future we will want to allow subprocesses to report their own additional meta information, such as the Endpoint process reporting an "isolated" status.
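
One possible shape for this under local_metadata, purely as a sketch (field names are not decided):

"local_metadata": {
  "elastic": {
    "agent": {
      "processes": [
        { "name": "endpoint-security", "status": "HEALTHY", "meta": { "isolated": false } },
        { "name": "filebeat", "status": "DEGRADED" }
      ]
    }
  }
}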

[Agent] Reporting failures

Original comment by @michalpristas:

At the moment the process is as follows using grpc

  1. The Beat uses the manager.Register call to register the settings it knows how to handle.
  2. The configuration is read/detected by the agent, which processes it and sends it to the specific Beat.
  3. The Beat retrieves the configuration via the Config(string) endpoint and tries to parse it.
  4. The Beat sends the parsed configuration (in the form of map[string]interface{}) to a fleet/manager using a channel.
  5. The Beat's fleet/manager breaks the configuration into configuration blocks it understands (based on CM).
  6. The configuration blocks are applied using the existing Central Management mechanism.

When a failure occurs in steps 1, 2, or 3, it is returned to the caller as an error.

But when an error occurs in step 4 or 5, the agent is not aware of the failure (unless the Beat crashes, in which case the agent tries to restart it and apply the config again).

We need to think of a way to propagate failures from:

  • beat to agent
  • agent to upstream

We also need to think about pairing an experienced failure with a concrete configuration version (this is more or less a question for @mattapperson).
At this moment, the Beat does not have a concept of a configuration version at all, nor does the agent propagate a version down the stack.

cc @ph @urso

[Elastic Agent] Failed Endpoint installation retry loop

In situations where the Elastic Endpoint Security integration fails to install successfully, Agent appears to continuously retry the installation. It's not clear whether there is a limit or cap on the retries, but there does not appear to be one. This results in unnecessary resource utilization, including filling up the elastic-agent.log file.

Details:

  • Version: 7.12.1 (at least)
  • Operating System: All versions of Windows; untested on macOS or Linux.
  • Steps to Reproduce: Please reach out directly for logs and steps to reproduce the failure.

Elastic-Agent: Do not output to STDERR under powershell, unless you want PS to fail execution as an error

  • What: elastic-agent.exe
  • Version: 7.10.0 (but all previous too)
  • OS: Windows 10 (as of 2020-11-24)

Problem:

In certain execution contexts PowerShell will convert any line of text sent to STDERR into an Error object. This will no doubt go unhandled, and thus the command is failed by PowerShell:

 PS C:\Users\Administrator\Documents\EC_Spout> C:\Users\Administrator\Documents\EC_Spout\agent_install+enroll.ps1
Uninstalling existing
Elastic Agent has been uninstalled.
The Elastic Agent is currently in BETA and should not be used in production

elastic-agent.exe : 2020-11-24T07:32:24.902-0800	DEBUG	kibana/client.go:170	Request method: POST, path: 
/api/fleet/agents/enroll
At C:\Users\Administrator\Documents\EC_Spout\agent_install+enroll.ps1:91 char:1
+ & "$download_dir\elastic-agent-$stack_ver-windows-x86_64\elastic-agen ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (2020-11-24T07:3...t/agents/enroll:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError 

Google "PS NativeCommandError" to discover all the happy people that trip over this error.

Solution:
On Windows systems (especially under PowerShell), do not send anything to STDERR unless it really is an error and the command should be terminated/failed.

Ways to reproduce:
In an interactive PS session, this error handling will most likely not be enabled. Under ISE it most often is.

Open PowerShell ISE and write a ps1 script:

 & "$download_dir\elastic-agent-7.10.0-windows-x86_64\elastic-agent.exe" install -f -k "$kn_url" -t "$agent_token" 

Run the script with the 'play' button in the toolbar (after saving it).

Doing it via ISE like this was the easiest way, I think, to get PS into such an error handling mode. I have experienced the same problem with PS scripts started by the task scheduler.

Extra info:
I maintain scripts to automate starting a demo env.: https://github.com/ElasticSA/ec_spout (more info for Elastic employees here: https://wiki.elastic.co/display/PRES/EC+Spout )

Improve handling of Agent Install / Enroll if already (previously) installed

Describe the feature:

When you run the enrollment command for Elastic Agent on a host where it has already been installed, it terminates and you get the following error (on Mac at least):
"Error: already installed at: /Library/Elastic/Agent"
To continue, you then need to work out how to uninstall the agent and re-run the command.
It is likely that people doing initial testing will try to enroll a test host in more than one cluster as they iterate on dev/PoC clusters, so it would be useful if Agent handled this situation better.

The ideal scenario is that Agent would ask if you want to change the configuration of the installed agent to enroll it in the new cluster. Alternatively, it could ask for confirmation and then uninstall the existing agent for the user.

As a fallback, it could at least provide the full uninstall command to the user to be able to continue.

Describe a specific use case for the feature:

Setup of Elastic Agent

[Ingest Manager] elastic-agent process is not properly terminated after restart

Environment

Steps to Reproduce

  1. Start a Centos:7 docker container: docker run --name centos centos:7 tail -f /dev/null
  2. Enter the container: docker exec -ti centos bash
  3. Download the agent RPM package: curl https://snapshots.elastic.co/8.0.0-3ce083a1/downloads/beats/elastic-agent/elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm -o /elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm
  4. Install systemctl replacement for Docker: curl https://raw.githubusercontent.com/gdraheim/docker-systemctl-replacement/master/files/docker/systemctl.py -o /usr/bin/systemctl
  5. Install the RPM package with yum: yum localinstall /elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm -y
  6. Enable service: systemctl enable elastic-agent
  7. Start service: systemctl start elastic-agent
  8. Check processes: top. There should be only one process for the elastic-agent
  9. Restart service: systemctl restart elastic-agent
  10. Check processes: top

Behaviours:

Expected behaviour

After the initial restart, the elastic-agent appears once, not in the Z state.
Screenshot 2020-08-11 at 17 16 34

Current behaviour

After the initial restart, the elastic-agent appears twice, once in the Z state and once in the S state (as shown in the attachment).
Screenshot 2020-08-11 at 17 15 38

Other observations

This behavior persists across multiple restarts: the elastic-agent process gets into the zombie state each time it is restarted (note that I restarted it three times, so there are 3 zombie processes):
Screenshot 2020-08-11 at 17 18 22

One shot script

docker run -d --name centos centos:7 tail -f /dev/null
docker exec -ti centos bash

Inside the container

curl https://snapshots.elastic.co/8.0.0-3ce083a1/downloads/beats/elastic-agent/elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm -o /elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm
curl https://raw.githubusercontent.com/gdraheim/docker-systemctl-replacement/master/files/docker/systemctl.py -o /usr/bin/systemctl 
yum localinstall /elastic-agent-8.0.0-SNAPSHOT-x86_64.rpm -y
systemctl enable elastic-agent
systemctl start elastic-agent
systemctl restart elastic-agent
top

Add a way to unenroll an Elastic Agent from the client side

Describe the enhancement:

Currently there's no way to unenroll an elastic-agent from the client side

Describe a specific use case for the enhancement or feature:

When running ephemeral instances (containers, for example) each can enroll, but when the container is stopped we end up with stranded offline instances in fleet, which then takes two commands per host on the Fleet screen (unenroll and force unenroll, because they never unenroll), for a total of 6 clicks, plus delays, for each host.

If there were an unenroll subcommand for ./elastic-agent, it could be called in the container teardown, for example as sketched below.
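
A hypothetical teardown step (the subcommand and flag shown do not exist today):

./elastic-agent unenroll --force   # hypothetical: remove this agent from Fleet before the container exits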

Retrieving integration configuration for auto-discovered inputs on kubernetes

Describe the enhancement:
With Beats, the configuration was available to Metricbeat and Filebeat locally on the host, but with the agent and packages we moved the configuration definition into the package registry. So when Kubernetes autodiscovery detects that a pod runs software that we monitor, either through dynamic input conditionals in the agent config or via hints-based discovery, the agent needs to download the integration config for the auto-discovered software. When we deliver this enhancement, the agent will automagically ship data to the right data streams in Elasticsearch, similar to how Beats do today. The user still needs to install the right package in Kibana. We will tackle auto-installation of the package in a separate issue.

Example scenario:
A worker node is running nginx in a pod, and through dynamic inputs or hints-based auto-discovery, the agent detects the existence of nginx running on that worker node. The agent is able to retrieve the nginx configuration for metrics and logs, fill in the values provided in the auto-discovery configuration (e.g. see the configuration examples in Metricbeat here), and ship data to Elasticsearch successfully.
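
For reference, a conditional input in a standalone agent policy using the kubernetes provider looks roughly like this (a sketch; exact keys may differ):

inputs:
  - type: logfile
    data_stream.dataset: nginx.access
    streams:
      - paths:
          - /var/log/containers/*${kubernetes.container.id}.log
        # only enabled on pods that the provider labels as nginx
        condition: ${kubernetes.labels.app} == 'nginx'

The enhancement would let the agent fetch the stream definitions above from the package registry instead of requiring them to be spelled out in the policy.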

[Elastic Agent] Improve error logging (reduce to Info Level where possible) wrt changing logging level to debug

  • Version: 7.13.0-SNAPSHOT
  • Environment: Docker

The Elastic Agent was running for a few minutes and I changed the logging level in the Fleet UI from Info to Debug. This all seems to have worked, but the first few lines that were logged looked as follows:

2021-05-03T19:26:23.569Z	INFO	process/app.go:176	Signaling application to stop because of shutdown: metricbeat--7.13.0-SNAPSHOT
2021-05-03T19:26:24.066Z	INFO	log/reporter.go:40	2021-05-03T19:26:24Z - message: Application: filebeat--7.13.0-SNAPSHOT[4f12dd1d-f096-40b1-8bf4-8a0e66722775]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
2021-05-03T19:26:24.066Z	INFO	log/reporter.go:40	2021-05-03T19:26:24Z - message: Application: filebeat--7.13.0-SNAPSHOT--36643631373035623733363936343635[4f12dd1d-f096-40b1-8bf4-8a0e66722775]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
2021-05-03T19:26:24.066Z	INFO	log/reporter.go:40	2021-05-03T19:26:24Z - message: Application: metricbeat--7.13.0-SNAPSHOT--36643631373035623733363936343635[4f12dd1d-f096-40b1-8bf4-8a0e66722775]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
2021-05-03T19:26:24.066Z	INFO	log/reporter.go:40	2021-05-03T19:26:24Z - message: Application: metricbeat--7.13.0-SNAPSHOT[4f12dd1d-f096-40b1-8bf4-8a0e66722775]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
2021-05-03T19:26:24.999Z	ERROR	fleet/fleet_gateway.go:167	context canceled
2021-05-03T19:26:26.378Z	ERROR	fleet/fleet_gateway.go:167	context canceled
2021-05-03T19:26:27.852Z	INFO	fleet/fleet_gateway.go:298	Fleet gateway is stopping
2021-05-03T19:26:27.852Z	INFO	status/reporter.go:236	Elastic Agent status changed to: 'online'

I stumbled over the two ERROR log entries related to context, which also contain very little "context" about what they refer to.

[Elastic Agent] Add ability to debug a composable provider

Overview

To help support dynamic inputs (elastic/beats#19225), Elastic Agent needs to add the ability to debug the providers used for variable substitution. This issue is to track the debugging effort; for information about variable substitution, review elastic/beats#20781.

Debugging

This obviously adds a lot of confusion about the resulting configuration that Elastic Agent will be running with. To ensure that the feature is deployed correctly and that providers are working as expected, debugging needs to be a top priority in the implementation.

Debugging the running daemon

With the new ability to communicate with the running daemon the inspect command should be changed to talk to the running daemon and return the current configuration that is being used in memory. This will ensure that with running providers like Docker and Kubernetes it is easy to inspect what the resulting configuration is.

The current inspect and output commands can be combined and moved under the debug subcommand. (Note: This is not connecting to the currently running Elastic Agent)

$ ./elastic-agent debug config

Possible to watch the configuration as changes come in with --watch.

$ ./elastic-agent debug config --watch

Debugging a single provider

A new debug command should be implemented that runs a single provider and outputs what it's currently providing back to the Elastic Agent. (Note: This is not connecting to the currently running Elastic Agent)

Example outputting docker provider inventory key/value mappings:

$ ./elastic-agent debug provider docker 
{"id": "1",  "mapping": {"id": "1", "paths": {"log": "/var/log/containers/1.log"}}, "processors": {"add_fields": {"container.name": "my-container"}},}
{"id": "2", "mapping": {"id": "2", "paths": {"log": "/var/log/containers/2.log"}}, "processors": {"add_fields": {"container.name": "other-container"}},}
{"id": "2", "mapping": nil}

Example rendering configurations with changes:

$ ./elastic-agent debug provider docker -c testing-conf.yml
# {"id": "1",  "mapping": {"id": "1", "paths": {"log": "/var/log/containers/1.log"}}, "processors": {"add_fields": {"container.name": "my-container"}}}
inputs:
  - type: logfile
    path: /var/log/containers/1.log
    processors:
      - add_fields:
          container.name: my-container
# {"id": "2", "mapping": {"id": "2", "paths": {"log": "/var/log/containers/2.log"}}, "processors": {"add_fields": {"container.name": "other-container"}}}
inputs:
  - type: logfile
    path: /var/log/containers/1.log
    processors:
      - add_fields:
          container.name: my-container
  - type: logfile
    path: /var/log/containers/2.log
    processors:
      - add_fields:
          container.name: other-container
# {"id": "2", "mapping": nil}
inputs:
  - type: logfile
    path: /var/log/containers/1.log

[Discuss][Elastic Agent] Deprecation logs for Elastic Agent

Elastic Stack products will start to ship deprecation logs to a specific index based on the new indexing strategy. Elastic Agent should do the same and ship deprecation logs to logs-deprecation.elastic.agent-*. It should also be discussed how and where the deprecation logs of the processes are shipped.
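
Following the indexing strategy, the deprecation logs would carry data_stream fields along these lines (illustrative):

data_stream:
  type: logs
  dataset: deprecation.elastic.agent
  namespace: default

which resolves to the logs-deprecation.elastic.agent-default data stream.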

[Elastic Agent] Debug fleet-server connectivity

The Elastic Agent must connect to the fleet-server for enrollment. There are several issues that can happen around connectivity to fleet-server. If the enrollment doesn't work, it would be nice to have a command line tool to investigate what the actual issue is: a certificate issue, port not open, host not reachable, wrong token, etc.
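
Until such a tool exists, a rough manual reachability check can be run against the fleet-server status endpoint, for example (assuming the default port and that fleet-server exposes /api/status; the host is a placeholder):

curl -v https://fleet-server.example.com:8220/api/status

The verbose output surfaces TLS/certificate problems and connection refusals; a wrong token would only show up during the actual enroll call.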

This idea was triggered by issues like this one: elastic/fleet-server#235 (comment)

Implement a sub-command to show or follow the Elastic Agent logs

Debugging Elastic Agent is currently not as easy as it should be. In case of an issue, the right paths for the logs have to be found and the files read one by one. It would be very convenient if the Elastic Agent offered a command to get the logs and metrics.

To tail all the logs, something like the following would be useful:

elastic-agent logs -f

Maybe later support for filtering logs from only a specific process could be added. One step further would be, that on the fly the logging level could be changed.

The same is true for metrics. It would be nice if a snapshot of the metrics could be gathered with something like the following:

elastic-agent metrics

[Elastic Agent] Improve "when" handling in the program specification.

When we created the "when" clause we were under the impression that all the beats were actually equal and supported all the same outputs. This was not completely true, APM-Server supports a subset of the output that beats supports, they do not support redis.

Maybe we should just move to the conditions and rely on capabilities

We should improve the reporting when an output is used that is not supported by a running process; currently it fails silently.

[Fleet] Improve status reporting for Agents

Description

Currently we report a very basic status during checkin.
To allow us to give users more details on the status of their agents, we want to send a more complete policy status (the format is defined in elastic/kibana#82298).

The status will be sent during agent checkin:

  • This will allow Fleet Kibana (and Fleet Server in the future) to update the agent saved object and allow users to search by agent status and input status
  • A future coordinator process to take actions after a status change

Questions

How do we persist status in ES?

  • I was thinking of updating the current agent SO with the actual status

Open question: should the agent also send that data to ES directly?
Is this already the case if status changes are in the agent logs? If yes, will this log data be searchable?

Pro:

  • Allows having agent status for agents not managed by Fleet
  • Allows having historical data of agent status
  • Provides agent status when the connection from Agent to Fleet is broken

@blakerouse @ruflin I am curious to have your thoughts here on how this can work with the future Fleet Server too

[Elastic Agent] No ES output validation can lead to stopped sub processes and no logs

The Fleet settings UI allows for setting all kinds of values in the Elasticsearch output configuration. There is no validation, allowing for any kind of input.

Observed behavior:

  • Setting enabled: false: the Elastic Agent kills the sub processes and only logs that it stopped them. There is no indication of why they were stopped, and no additional entries in the sub process logs.
  • Using a wrong type, e.g. bulk_max_size: "4s": Elastic Agent keeps restarting the sub processes, which can't start because they all have an invalid configuration. The subprocess logs include detailed information - helpful for debugging.
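
For reference, the two misconfigurations above correspond to output settings like the following being saved in the Fleet settings UI:

enabled: false          # first case: agent silently stops the sub processes
bulk_max_size: "4s"     # second case: sub processes restart in a loop with an invalid config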

Expected behavior:

  • the logs should always contain helpful information - if the Agent stops the sub processes, a reason should be logged.

Ideally there would be some validation preventing invalid configurations from being stored:

  • only supported options could be allowed
  • loose validation - (most important) supported options are validated, other options are passed on but not validated

I wasn't certain whether this belongs here and/or to Fleet.

[Agent] Testing & Support running of Elastic Agent from a network shared drive

Describe the enhancement:
Users may want to protect and observe their network shared drives, so we could support it.

It is currently not recommended. We have no automated tests to verify it works, and we have anecdotal (but old) data indicating that (at least) Filebeat would have a problem running there. No further specifics are available at this time (testing would be required to generate example errors, etc.).

Will leave this logged as an enhancement for now, and will add a brief note tied to this in the obs-docs for Agent.

[Elastic Agent] Overwrite global processor settings

At the moment the logs of Filebeat instances started by the Agent are polluted with the debug logs of the add_docker_metadata processor. Example log line:

{"log.level":"error","@timestamp":"2020-07-23T16:00:00.372+0200","log.logger":"add_docker_metadata.docker","log.origin":{"file.name":"docker/watcher.go","file.line":320},"message":"Error watching for docker events: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.5.0"}

As I am not trying to test anything Docker related, these lines are distracting. Normally I just remove all processors when debugging Filebeat issues. However, as the Agent manages the configuration of these Filebeat instances, I do not have access to these global processors. Also, the inspect subcommand does not show if these processors are included or not. (Or at least I am not aware of any flags which could show it.) But based on the logs, these processors are enabled.

$ ./elastic-agent inspect output -o default -p filebeat
Action ID: 856251a6-a6f8-40e9-b71b-af54738c3280
[default] filebeat:
filebeat:
  inputs:
  - exclude_files:
    - .gz$
    id: logfile-postgresql.log
    index: logs-postgresql.log-default
    meta:
      package:
        name: postgresql
        version: 0.1.0
    multiline:
      match: after
      negate: true
      pattern: '^\d{4}-\d{2}-\d{2} '
    name: postgresql-1
    paths:
    - /home/n/go/src/github.com/elastic/beats/filebeat/module/postgresql/log/test/*.log
    processors:
    - add_fields:
        fields:
          ecs:
            version: 1.5.0
        target: ""
    - add_fields:
        fields:
          name: postgresql.log
          namespace: default
          type: logs
        target: dataset
    - add_fields:
        fields:
          dataset: postgresql.log
        target: event
    type: log
output:
  elasticsearch:
    api_key: {{key}}
    hosts:
    - {{outputhost}}

This not only impacts developers but users as well, as the logs of their agent managed Beat instances will be full of these docker/aws/etc. processor logs even if those are irrelevant.

Elastic-agent deletes tarballs on `run`

For confirmed bugs, please report:

  • Version: 8.0.0
  • Operating System: Fedora Linux using the tarball install
  • Discuss Forum URL:
  • Steps to Reproduce:
    • Download and build master
    • in x-pack/elastic-agent run mage package
    • unpack the tarball
    • Run ./elastic-agent enroll using the key you get from the Kibana UI
    • Run ./elastic-agent run

After that, I'm seeing this error:

Application: metricbeat--8.0.0[a3d097d2-59a5-4517-a5cc-86c906ac71c2]: State changed to FAILED: 2 errors occurred: * package '/home/alexk/go/src/github.com/elastic/beats/x-pack/elastic-agent/build/distributions/elastic-agent-8.0.0-linux-x86_64/data/elastic-agent-66d393/downloads/metricbeat-8.0.0-linux-x86_64.tar.gz' not found: open /home/alexk/go/src/github.com/elastic/beats/x-pack/elastic-agent/build/distributions/elastic-agent-8.0.0-linux-x86_64/data/elastic-agent-66d393/downloads/metricbeat-8.0.0-linux-x86_64.tar.gz: no such file or directory * call to 'https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-8.0.0-linux-x86_64.tar.gz' returned unsuccessful status code: 404: /go/src/github.com/elastic/beats/x-pack/elastic-agent/pkg/artifact/download/http/downloader.go[142]: unknown error 

The tarballs that come packaged in data/elastic-agent-HASH/downloads/ are being deleted. The files are there when I unpack the elastic-agent tarball, and afterwards at least one has been removed. I also discovered that if you try to copy a new tarball into the downloads/ directory while elastic-agent is running, it will instantly delete it. Sometimes it's metricbeat, sometimes it's filebeat. I'm seeing it with install as well as enroll. Manually unpacking the tarballs and putting them in install/ doesn't help.

[elastic-agent] Discuss: Default port for fleet-server when no port defined

The default port for fleet-server is 8220. When enrolling an Elastic Agent with --url=http://localhost, the port 8220 is picked by default. The same is the case if https is used. On Cloud, fleet-server is exposed on 443/9243. If no port is specified in the UI or during enrollment, it does not work by default.

I want to discuss if this is the expected behaviour or if we should default to 80/443 if no port is specified.
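
For now, specifying the port explicitly avoids the ambiguity, for example (placeholder URL and token):

sudo ./elastic-agent enroll --url=https://fleet.example.com:443 --enrollment-token=<token>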

[elastic-agent] Support a "Shutdown" or "Stop" command in addition to Restart

Currently I am developing a non-interactive installation of the elastic agent for multiple platforms and would like to be able to run "elastic-agent run ...args..." from my application. I am able to do so, but since it is not installed as a service I can only send Terminate signals on Windows to shut it down (because Windows, of course, doesn't have proper signal handling; on Unix/macOS I can just send a HUP, so this isn't as much of an issue). The elastic-agent control proto only has a "Restart" command but not a "Stop" command, or I would use the elastic-agent-client to gracefully shut down the application. If I terminate it on Windows, the child processes (filebeats, etc.) are not cleaned up and, what's worse, they just grow endlessly before crashing (~32 GB and counting). Alternatively, if the application would clean up when receiving a Windows terminate event (sans /F), that would also be perfect.

[Elastic Agent] Endpoint for metrics/healthcheck

Describe the enhancement:

All the Beats have a setting to start an endpoint where you can check their stats; these are useful for monitoring the internal state of the Beat. This feature can expose an HTTP port, a Unix socket, or a named pipe.

Scenario: Listen for basic request
Given An Elastic Agent enrolled in a Kibana instance
And you start the Elastic Agent with the option -E http.enabled=true
And a host or IP is set to listen on -E http.host=localhost
And a port is set to listen on -E http.port=5066
When a user makes the request curl -XGET http://localhost:5066/?pretty
Then the Elastic Agent responds with its basic info in JSON format

{
  "beat": "elastic-agent",
  "hostname": "example.lan",
  "name": "example.lan",
  "uuid": "34f6c6e1-45a8-4b12-9125-11b3e6e89866",
  "version": "7.10.0"
}

Scenario: Listen for basic info request
Given An Elastic Agent enrolled in a Kibana instance
And you start the Elastic Agent with the option -E http.enabled=true
And a unix socket is set to listen on -E http.host=unix:///tmp/elastic-agent.sock
When a user makes the request curl -XGET --unix-socket 'unix:///tmp/elastic-agent.sock/?pretty'
Then the Elastic Agent responds with its basic info in JSON format

{
  "beat": "elastic-agent",
  "hostname": "example.lan",
  "name": "example.lan",
  "uuid": "34f6c6e1-45a8-4b12-9125-11b3e6e89866",
  "version": "7.10.0"
}

Scenario: Listen for stats request
Given An Elastic Agent enrolled in a Kibana instance
And you start the Elastic Agent with the option -E http.enabled=true
And a host or IP is set to listen on -E http.host=localhost
And a port is set to listen on -E http.port=5066
When a user makes the request curl -XGET 'http://localhost:5066/stats?pretty'
Then the Elastic Agent responds with its stats in JSON format

{
  "beat": {
    "cpu": {
      "system": {
        "ticks": 1710,
        "time": {
          "ms": 1712
        }
      },
      "total": {
        "ticks": 3420,
        "time": {
          "ms": 3424
        },
        "value": 3420
      },
      "user": {
        "ticks": 1710,
        "time": {
          "ms": 1712
        }
      }
    },
    "info": {
      "ephemeral_id": "ab4287c4-d907-4d9d-b074-d8c3cec4a577",
      "uptime": {
        "ms": 195547
      }
    },
    "memstats": {
      "gc_next": 17855152,
      "memory_alloc": 9433384,
      "memory_total": 492478864,
      "rss": 50405376
    },
    "runtime": {
      "goroutines": 22
    }
  },
  "libbeat": {
    "config": {
      "module": {
        "running": 0,
        "starts": 0,
        "stops": 0
      },
      "scans": 1,
      "reloads": 1
    },
    "output": {
      "events": {
        "acked": 0,
        "active": 0,
        "batches": 0,
        "dropped": 0,
        "duplicates": 0,
        "failed": 0,
        "total": 0
      },
      "read": {
        "bytes": 0,
        "errors": 0
      },
      "type": "elasticsearch",
      "write": {
        "bytes": 0,
        "errors": 0
      }
    },
    "pipeline": {
      "clients": 6,
      "events": {
        "active": 716,
        "dropped": 0,
        "failed": 0,
        "filtered": 0,
        "published": 716,
        "retry": 278,
        "total": 716
      },
      "queue": {
        "acked": 0
      }
    }
  },
  "system": {
    "cpu": {
      "cores": 4
    },
    "load": {
      "1": 2.22,
      "15": 1.8,
      "5": 1.74,
      "norm": {
        "1": 0.555,
        "15": 0.45,
        "5": 0.435
      }
    }
  }
}

Scenario: Listen for stats request
Given An Elastic Agent enrolled in a Kibana instance
And you start the Elastic Agent with the option -E http.enabled=true
And a unix socket is set to listen on -E http.host=unix:///tmp/elastic-agent.sock
When a user makes the request curl -XGET --unix-socket 'unix:///tmp/elastic-agent.sock/stats/?pretty'
Then the Elastic Agent responds with its stats in JSON format

{
  "beat": {
    "cpu": {
      "system": {
        "ticks": 1710,
        "time": {
          "ms": 1712
        }
      },
      "total": {
        "ticks": 3420,
        "time": {
          "ms": 3424
        },
        "value": 3420
      },
      "user": {
        "ticks": 1710,
        "time": {
          "ms": 1712
        }
      }
    },
    "info": {
      "ephemeral_id": "ab4287c4-d907-4d9d-b074-d8c3cec4a577",
      "uptime": {
        "ms": 195547
      }
    },
    "memstats": {
      "gc_next": 17855152,
      "memory_alloc": 9433384,
      "memory_total": 492478864,
      "rss": 50405376
    },
    "runtime": {
      "goroutines": 22
    }
  },
  "libbeat": {
    "config": {
      "module": {
        "running": 0,
        "starts": 0,
        "stops": 0
      },
      "scans": 1,
      "reloads": 1
    },
    "output": {
      "events": {
        "acked": 0,
        "active": 0,
        "batches": 0,
        "dropped": 0,
        "duplicates": 0,
        "failed": 0,
        "total": 0
      },
      "read": {
        "bytes": 0,
        "errors": 0
      },
      "type": "elasticsearch",
      "write": {
        "bytes": 0,
        "errors": 0
      }
    },
    "pipeline": {
      "clients": 6,
      "events": {
        "active": 716,
        "dropped": 0,
        "failed": 0,
        "filtered": 0,
        "published": 716,
        "retry": 278,
        "total": 716
      },
      "queue": {
        "acked": 0
      }
    }
  },
  "system": {
    "cpu": {
      "cores": 4
    },
    "load": {
      "1": 2.22,
      "15": 1.8,
      "5": 1.74,
      "norm": {
        "1": 0.555,
        "15": 0.45,
        "5": 0.435
      }
    }
  }
}

https://www.elastic.co/guide/en/beats/metricbeat/current/http-endpoint.html

Describe a specific use case for the enhancement or feature:

With this endpoint you can check the stats of the Elastic Agent; it can also be used to create a health check for Docker images, as sketched below.
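
A minimal sketch of such a health check in a Dockerfile, assuming the HTTP endpoint is enabled and listening on localhost:5066 as in the scenarios above:

HEALTHCHECK --interval=30s --timeout=5s \
  CMD curl -fsS http://localhost:5066/stats || exit 1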

[Meta] Mac M1 Support

We aim to support M1 chips by providing a Universal 2 binary

  • Decide how to build the binary for M1, either universal 2 binary or 2 different binaries (elastic/beats#27741 (comment))
    • Decision: build a Universal 2 binary
  • Support Universal 2 Binaries
    • Investigate if we need a special crossbuild docker images to build for M1. (elastic/golang-crossbuild#100)
    • Release Universal binaries:
      • Elastic-Agent
      • Beats
      • Fleet-server
      • APM
      • Elastic-Endpoint
  • Enable Elastic Agent & beats to compile for darwin/arm64
    • Add a new target to the existing platforms
    • Investigate how the build will work with our crossbuild images.
    • Enable Agent to be built for darwin/arm64 (#204)
    • Enable beats to be built for darwin/arm64 (elastic/beats#29582)
    • Enable Fleet Server to be built for darwin/arm64 (elastic/fleet-server#1371)
  • Enable Jenkins Pipeline for M1 workers (elastic/beats#26686)
  • Validate system integration (elastic/beats#26688)
  • Update release manager with the m1 artifact. (#213)
    • Add Elastic Agent and Beats to download pages
  • Endpoint security:
    • Validate that Endpoint security (which has a universal binary) is still operational
    • Endpoint security publishes aarch64 and universal artifacts
  • Consider enhancement to the documentation
    • Is this step needed? Or do we just claim that Agent is supported on macOS 11.0, which would apply to any hardware that OS is available on?

Resources:

Related issues:

[Agent] Add geo metadata to agents on enrollement

As discussed with @ph @drewpost @mostlyjason and @blakerouse it'd be useful to have the same fields present in the add_observer/geo_metadata processors available from the agent. This could be exposed via an API to drive Uptime's UI, showing which geographic regions can run uptime monitoring checks.

Furthermore, it'd be great to automatically fill this data based on cloud metadata where possible, providing sane defaults for common cloud datacenters like us-east-1a on AWS etc.
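
For reference, the geo fields exposed by the Beats add_observer_metadata processor look roughly like this (example values):

processors:
  - add_observer_metadata:
      geo:
        name: us-east-1a
        location: "40.7128, -74.0060"
        continent_name: North America
        country_iso_code: US

The proposal is to make the same information available from the agent itself, ideally pre-filled from cloud metadata.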

Investigate FIPS compliance using boring crypto

Beats uses crypto modules for the local keystore (storing passwords and other credentials), TLS in the outputs, and also TLS support in some push-based inputs (HTTP server, syslog server).

The go stdlib crypto libraries are not FIPS compliant. Related: golang/go#21734

As different crypto libraries might provide different ciphers, it would be useful if we could switch the crypto library in use via environment variables.
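
For example, with a Go toolchain that supports it, the BoringCrypto-based crypto implementation can be selected at build time (a sketch; the exact mechanism depends on the Go version in use):

GOEXPERIMENT=boringcrypto go build ./x-pack/elastic-agent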

Investigate:

  • Impact of using fips version of golang
  • Impact on the developer's experience
  • Impact on the testing / CI
  • Impact on our build and releases tasks
  • Can we support everything on our support matrix https://www.elastic.co/support/matrix
  • Impact on crossbuilding and older version of glibc
