appuio / component-openshift4-logging
Commodore component to manage OpenShift 4 cluster logging
License: BSD 3-Clause "New" or "Revised" License
Collector pods keep being restarted.
{"_ts":"2022-10-18T12:18:16.867228372Z","_level":"0","_component":"cluster-logging-operator","_message":"clusterlogforwarder-controller error updating status","_error":{"msg":"Operation cannot be fulfilled on clusterlogforwarders.logging.openshift.io \"instance\": the object has been modified; please apply your changes to the latest version and try again"}}
Collector pods keep running.
https://access.redhat.com/solutions/6976455 suggests explicitly setting:
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  ...
  name: instance
  namespace: openshift-logging
  ...
spec:
  collection:
    type: fluentd
    logs:
      type: fluentd
For spec.collection.logs.type this is already done. Adding spec.collection.type is straightforward.

Currently we manage and deploy alert rules for the logging stack in component openshift4-monitoring, cf. https://hub.syn.tools/openshift4-monitoring/references/parameters.html#_upstreamrules_elasticsearchoperator. This doesn't make much sense, since we can't reliably select the correct version of the alert rules for the logging stack in that component, as we don't know exactly which logging stack version is deployed.
In contrast, this component always knows which logging stack version is getting installed and can therefore easily select the matching set of alert rules.
Keep the alert rules management in component openshift4-monitoring
The component fetches the upstream fluentd alerts from the master branch of https://github.com/openshift/cluster-logging-operator (cf. make gen-golden). Golden tests start failing if the upstream alerts are modified on the master branch.
We want deterministic compilation of components with default parameters. We should pick some stable version of the upstream repo to use in the component defaults.
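A pinned default could look like the following in the component's parameter hierarchy. The `alerts` parameter is referenced elsewhere in the component (as `${openshift4_logging:alerts}`); the concrete branch name pinned below is an assumption for illustration:

```yaml
parameters:
  openshift4_logging:
    # Pin to a release branch instead of master for deterministic compilation
    alerts: "release-5.6"
```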
The extracted alerts for version 5.7 are missing. Check whether they are still required, whether they need to be updated, and add them specifically for this version.
Path: component/extracted_alerts/release-5.7/*
N/A
Making the ClusterLogForwarder configurable in this component would be a good addition. With that, another part of the OCP4 logging stack could be configured through this component, and parts of the existing configuration (e.g. the namespace) could easily be reused.
It's possible using https://github.com/projectsyn/component-adhoc-configurations, but this object should be managed by this component.
The cluster-logging stack version 5.6 is now released, cf. https://docs.openshift.com/container-platform/4.12/logging/cluster-logging-release-notes.html
We need to add support for the new logging stack version in the component. Things to consider are:
When configuring the component to retrieve alert rules for release-5.5 or later, compiling the component fails.
This happens because the upstream rules were moved from https://raw.githubusercontent.com/openshift/cluster-logging-operator/${openshift4_logging:alerts}/files/fluentd/fluentd_prometheus_alerts.yaml
to https://raw.githubusercontent.com/openshift/cluster-logging-operator/${openshift4_logging:alerts}/files/collector/fluentd_prometheus_alerts.yaml
for release-5.5 and later (the file moved to a different folder).
openshift4_logging.alerts: "release-5.4" (works)
openshift4_logging.alerts: "release-5.5" (fails)
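A small sketch of how the component could derive the correct download path from the configured branch. The helper name and the version parsing are assumptions for illustration, not actual component code:

```python
# Sketch: pick the alert-rule folder based on the configured release branch.
# The file moved from files/fluentd/ to files/collector/ in release-5.5.
BASE = "https://raw.githubusercontent.com/openshift/cluster-logging-operator"

def alerts_url(release: str) -> str:
    folder = "collector"  # master and release-5.5+ use the new location
    if release.startswith("release-"):
        major, minor = (int(p) for p in release[len("release-"):].split("."))
        if (major, minor) < (5, 5):
            folder = "fluentd"
    return f"{BASE}/{release}/files/{folder}/fluentd_prometheus_alerts.yaml"
```

With this, `alerts_url("release-5.4")` points at the old `files/fluentd/` path, while `release-5.5` and `master` resolve to `files/collector/`.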
Unknown (Non-Kapitan) Error occurred
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 159, in fetch_http_dependency
    content_type = fetch_http_source(source, cached_source_path, item_type)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 207, in fetch_http_source
    content, content_type = make_request(source)
                            ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/utils.py", line 478, in make_request
    r.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/kapitan/targets.py", line 113, in compile_targets
    fetch_dependencies(
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 88, in fetch_dependencies
    [p.get() for p in pool.imap_unordered(http_worker, http_deps.items()) if p]
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 88, in <listcomp>
    [p.get() for p in pool.imap_unordered(http_worker, http_deps.items()) if p]
  File "/usr/local/lib/python3.11/multiprocessing/pool.py", line 873, in next
    raise value
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml
404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml
Successful compile
By default, the log level of the Kubelet is set to debug. This generates a lot of noise in the logs. It also results in a huge amount of data stored within logging (Elasticsearch). The log level should be configurable through the configuration hierarchy. Section "3. Persistent configuration for OCP 4.6 and later" in https://access.redhat.com/solutions/4619431 explains how this can be done.
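A minimal sketch of the persistent approach from the linked solution, assuming the RHCOS kubelet service reads its verbosity from a KUBELET_LOG_LEVEL environment variable. The object name, Ignition version, and drop-in name below are assumptions; verify them against the Red Hat solution for the exact cluster version:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 40-worker-kubelet-loglevel
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - name: kubelet.service
        dropins:
        - name: 30-kubelet-loglevel.conf
          contents: |
            [Service]
            Environment="KUBELET_LOG_LEVEL=2"
```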
After upgrading to OpenShift Logging 5.7.0, the Operator is stuck on start with the error
Error: container has runAsNonRoot and image will run as root (pod: "cluster-logging-operator-6b5d9c7495-8rhqh_openshift-logging(ea92698c-ce34-48b5-b458-47aba00c469d)", container: cluster-logging-operator).
On the openshift-logging namespace the pod security is set to privileged:
$ oc get ns openshift-logging -o yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    ...
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: privileged
The OpenShift Logging Operator deployment enforces that the pod runs as non-root:
spec:
  template:
    spec:
      containers:
      - name: cluster-logging-operator
        ...
        securityContext:
          allowPrivilegeEscalation: false
      securityContext:
        runAsNonRoot: true
The pod is actually started with the SCC privileged-higher-prio:
$ oc get po cluster-logging-operator-674c877f5b-w25p9 -n openshift-logging -o yaml | grep "openshift.io/scc"
    openshift.io/scc: privileged-higher-prio
Either the namespace configuration is wrong, or the operator deployment is misconfigured on upgrades.
The Operator does not start. After removing securityContext.allowPrivilegeEscalation and securityContext.runAsNonRoot, the operator does start without issues.
The upgrade runs without the operator getting stuck in root / non-root conflicts.
Enable clusterLogForwarding to forward logs to 3rd party systems.
E.g. forward logs to an external Splunk server.
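A minimal ClusterLogForwarder sketch for such a setup, using a fluentdForward output as an example. The hostname is a placeholder, and a native Splunk output type only exists in newer logging stack versions:

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: external-forwarder
    type: fluentdForward
    url: tls://forwarder.example.com:24224  # placeholder endpoint
  pipelines:
  - name: forward-app-logs
    inputRefs:
    - application
    outputRefs:
    - external-forwarder
```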
After removing the openshift-logging component, the following objects are left behind.
❯ oc get crd
clusterlogforwarders.logging.openshift.io 2021-05-18T07:41:41Z
clusterloggings.logging.openshift.io 2021-05-18T07:41:41Z
elasticsearches.logging.openshift.io 2021-05-18T07:41:42Z
kibanas.logging.openshift.io 2021-05-18T07:41:42Z
❯ oc get operator
NAME AGE
cluster-logging.openshift-logging 49m
elasticsearch-operator.openshift-logging 49m
The openshift-logging component was removed via:
applications:
  - ~openshift-logging
The Operator and CRDs are orphaned.
All logging-related objects are removed.
We switch fetching the logging collector alert rules to use a lookup table in #72. However, future releases of the logging stack will no longer provide the collector alert rules as a YAML file, but instead as a Go constant, cf. openshift/cluster-logging-operator#1732.
We'll need to add support for handling this change to the component in some form.
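If the alerts end up embedded as a Go raw-string constant, the component tooling could extract them with a small text-processing step. A sketch under that assumption; the constant name and Go source below are hypothetical:

```python
import re

def extract_go_raw_string(go_source: str, const_name: str) -> str:
    """Return the body of a backtick-quoted Go raw string assigned to const_name."""
    match = re.search(re.escape(const_name) + r"\s*=\s*`([^`]*)`", go_source)
    if match is None:
        raise ValueError(f"constant {const_name!r} not found")
    return match.group(1)

# Hypothetical Go source embedding the alert rules:
go_src = """
package internal

const FluentdPrometheusAlerts = `
groups:
- name: logging_fluentd.alerts
  rules: []
`
"""
alerts_yaml = extract_go_raw_string(go_src, "FluentdPrometheusAlerts")
```

Go raw strings cannot contain backticks, so a non-greedy match up to the next backtick is sufficient for this case.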
Follow-up to #69.
To increase Elasticsearch storage we currently need to
If possible, it would be nice if the Commodore component increased the PV size as well.
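The desired size could then be driven from the hierarchy. A ClusterLogging sketch showing where the storage request lives; the size and storage class are placeholders:

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  logStore:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      storage:
        storageClassName: ssd  # placeholder
        size: 200G             # desired new size
```

Note that the Elasticsearch operator does not resize existing PVCs on its own, so increasing this value alone is not sufficient today.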
When installing elasticsearch-operator in the official namespace openshift-operators-redhat, Elasticsearch is not accessible. The default network policies allow-from-other-namespaces and allow-from-same-namespace are in place, which allow ingress traffic and traffic within a namespace; communication between namespaces is not allowed. The elasticsearch-operator is installed in namespace openshift-operators-redhat and needs access to Elasticsearch in namespace openshift-logging.
A network policy should be implemented which allows this communication.
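A sketch of such a policy. The pod and namespace label selectors are assumptions and need to be verified against the actual Elasticsearch pod labels:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-operators-redhat
  namespace: openshift-logging
spec:
  podSelector:
    matchLabels:
      component: elasticsearch  # assumed Elasticsearch pod label
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-operators-redhat
```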