[Elastic Agent] Report running processes and their health statuses about elastic-agent HOT 11 CLOSED

elastic commented on July 3, 2024

[Elastic Agent] Report running processes and their health statuses

from elastic-agent.

Comments (11)

elasticmachine commented on July 3, 2024

Pinging @elastic/agent (Team:Agent)

mostlyjason commented on July 3, 2024

@kevinlog What kind of health status info do you want reported? I saw you have policy response data that seems to indicate whether it's running successfully. I suppose that only covers initialization, not whether the endpoint becomes unhealthy later?

urso commented on July 3, 2024

@mostlyjason don't we already have another meta-issue regarding status reporting?

kevinlog commented on July 3, 2024

@mostlyjason

> What kind of health status info do you want reported? I saw you have policy response data that seems to indicate whether it's running successfully. I suppose that only covers initialization, not if the endpoint becomes unhealthy later?

Endpoint will periodically update its Policy Response if there are meaningful events that change Endpoint's compliance with how the user configured it, so it could change during its lifecycle.

@ferullo could give more details on when this may happen.

mostlyjason commented on July 3, 2024

@kevinlog Do we need another health status reporting mechanism if we already have policy response status? What additional use cases do you require that are not offered by the policy response status?

kevinlog commented on July 3, 2024

@mostlyjason sorry I missed this the first time.

> Do we need another health status reporting mechanism if we already have policy response status? What additional use cases do you require that are not offered by the policy response status?

I don't believe Endpoint needs another mechanism, I just think that Fleet users may want additional insight if a subprocess isn't running correctly. Policy compliance for Endpoint is big. So if that's in a "Failed" state, it would be good to bubble that up to Agent so that it can be reported in the UI. Otherwise, all Agents are "Healthy".

I think we could do this in a generic way so that Integrations have the option to ship a "Success/Failure/Warning" status to let Fleet users know something isn't right. Then they could drill down further to individual Agents or solutions to investigate further.

Let me know if that makes sense

mostlyjason commented on July 3, 2024

++ sounds like a good idea to make policy responses a generic feature for all integrations. I haven't seen how it works currently, but conceptually it sounds good because it would provide a more structured error we could show on the agent details page, without the user having to dig through logs. It's also nice to have a uniform behavior if we don't have it already.

++ on having a failure response status put the agent into an unhealthy state so we keep our states consistent. Again, I'm not sure how that bubbles up but it sounds good conceptually.

As a general principle I think we don't expose processes to users directly, but the policy response could contain an aggregate of failures across all processes. We could show this aggregate info on the agent details page without exposing the underlying processes in the schema, since exposing them may result in a breaking change for users if we remove or change them in the future.

@jen-huang are you aligned on not exposing processes to users in the schema? How do you see this aligning with policy responses? Would it help to have a formal definition/design step for this issue?
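
For illustration only, here is a minimal sketch of the aggregation idea discussed above: each integration reports a "Success/Failure/Warning" status, and the agent surfaces only the worst status plus the list of failing integrations, without exposing individual processes. Every name here (`Status`, `IntegrationReport`, `aggregate`) is hypothetical and not taken from the Elastic Agent or Fleet codebase, and treating an agent with no reports as healthy is an assumption rather than agreed behavior.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels, ordered so that a higher value is worse."""
    SUCCESS = 0
    WARNING = 1
    FAILURE = 2


@dataclass
class IntegrationReport:
    """Hypothetical per-integration report; field names are illustrative."""
    integration: str
    status: Status
    message: str = ""


def aggregate(reports):
    """Return (overall_status, problems) without exposing process details.

    The overall status is the worst status reported. An agent with no
    reports is treated as SUCCESS (an assumption, not agreed behavior).
    """
    overall = max((r.status for r in reports), default=Status.SUCCESS)
    problems = [r for r in reports if r.status is not Status.SUCCESS]
    return overall, problems


reports = [
    IntegrationReport("endpoint", Status.FAILURE, "policy could not be applied"),
    IntegrationReport("system", Status.SUCCESS),
]
overall, problems = aggregate(reports)
print(overall.name, [p.integration for p in problems])
```

With the sample reports, a single failing integration is enough to mark the whole agent as failed, which matches the suggestion that a failure response should put the agent into an unhealthy state.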

botelastic commented on July 3, 2024

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

jen-huang commented on July 3, 2024

@pierrehilbert @nimarezainia Not sure if we have an appropriate meta issue that can supersede this one, so I am reopening for now but feel free to close and redirect.

pierrehilbert commented on July 3, 2024

We have this one: https://github.com/elastic/ingest-dev/issues/1367

jlind23 commented on July 3, 2024

Closing this as done.
cc @ycombinator
