stackabletech / opa-operator
A kubernetes operator for the Open Policy Agent
License: Other
Potential segfault in localtime_r invocations
Details | |
---|---|
Package | chrono |
Version | 0.4.19 |
URL | chronotope/chrono#499 |
Date | 2020-11-10 |
Unix-like operating systems may segfault due to dereferencing a dangling pointer in specific circumstances. This requires an environment variable to be set in a different thread than the affected functions. This may occur without the user's knowledge, notably in a third-party library.
No workarounds are known.
See advisory page for additional details.
The PR #347 implements resource limits and requests for the opa container. The second container, opa-bundle-builder, currently does not have any limits.
The opa-bundle-builder basically reads all provided OPA Rego ConfigMaps and puts the content into a tar bundle. According to the OPA docs, this bundle could grow to quite a size.
Currently, the size of the data in ConfigMaps cannot exceed 1MB (etcd limit).
This can become a problem if there are many configmaps with rules / data.
The OpaBundleBuilder uses the tar crate, whose documentation states that not all of the content has to be held in memory explicitly.
If rules are adapted dynamically, this also could put quite a load on the CPU when repacking the bundle.tar.gz (not sure we ever reach that many/big rules though).
This is done when:
For stackabletech/operator-rs#470
Reference implementation of stackabletech/operator-rs#476
Since OPA is a dependency for Druid, Kafka and Trino, this will require adaptation of integration tests.
There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.
Error type: Cannot find preset's package (github>whitesource/merge-confidence:beta)
The used version of operator-rs should be updated to include stackabletech/operator-rs#114
We would like to emit Kubernetes events for all errors, please see the epic for details stackabletech/issues#158
In the OPA server config.yaml we can activate logging decisions to the console (see https://www.openpolicyagent.org/docs/latest/configuration/#decision-logs), which in turn would be picked up by vector.
services:
  - name: stackable
    url: http://localhost:3030/opa/v1
bundles:
  stackable:
    service: stackable
    resource: opa/bundle.tar.gz
    persist: true
    polling:
      min_delay_seconds: 10
      max_delay_seconds: 20
decision_logs:
  console: true
I tried this when implementing logging for OPA. The problem is that queries against the data API then return a JSON response like
{"decision_id": "123123123", "result": true}
Our self-written Java OPA authorizers (Druid, Trino) cannot deserialize that response via Jackson, since they expect only the result field to be set.
If we want decision logging to be enabled, we have to touch all opa-authorizer versions and make the Response object more robust (there can be even more fields in the response if required).
See
stackabletech/druid-opa-authorizer#72
stackabletech/trino-opa-authorizer#24
Potential segfault in the time crate
Details | |
---|---|
Package | time |
Version | 0.1.44 |
URL | time-rs/time#293 |
Date | 2020-11-18 |
Patched versions | >=0.2.23 |
Unaffected versions | =0.2.0,=0.2.1,=0.2.2,=0.2.3,=0.2.4,=0.2.5,=0.2.6 |
Unix-like operating systems may segfault due to dereferencing a dangling pointer in specific circumstances. This requires an environment variable to be set in a different thread than the affected functions. This may occur without the user's knowledge, notably in a third-party library.
The affected functions from time 0.2.7 through 0.2.22 are:
time::UtcOffset::local_offset_at
time::UtcOffset::try_local_offset_at
time::UtcOffset::current_local_offset
time::UtcOffset::try_current_local_offset
time::OffsetDateTime::now_local
time::OffsetDateTime::try_now_local
The affected functions in time 0.1 (all versions) are:
at
at_utc
Non-Unix targets (including Windows and wasm) are unaffected.
Pending a proper fix, the internal method that determines the local offset has been modified to always return None on the affected operating systems. This has the effect of returning an Err on the try_* methods and UTC on the non-try_* methods.
Users and library authors with time in their dependency tree should perform cargo update, which will pull in the updated, unaffected code.
Users of time 0.1 do not have a patch and should upgrade to an unaffected version: time 0.2.23 or greater, or the 0.3 series.
No workarounds are known.
See advisory page for additional details.
Part of this epic stackabletech/issues#241
usage.adoc with product specific information and link to common shared resources concept
usage.adoc
Which new version of OpenPolicyAgent should we support?
0.45.0
Additional information
Changes required
No breaking changes.
Implementation checklist
Please don't change anything in this list.
Not all of these steps are necessary for all versions.
Which new version of OpenPolicyAgent should we support?
Please specify the version, version range or version numbers to support, please also add these to the issue title
Additional information
If possible, provide a link to release notes/changelog
Changes required
Are there any upstream changes that we need to support?
e.g. new features, changed features, deprecated features etc.
Implementation checklist
Please don't change anything in this list.
Not all of these steps are necessary for all versions.
This allows for more flexibility and means we don't have to release a new operator for a new upstream version.
Update operator-rs to 0.27.1.
Fragment wherever possible
Quantity instead of parsing memory values yourself
Product image selection will be tracked later on by stackabletech/issues#305 but should be pretty easy compared to the changes in this issue.
Currently our operators will not act on removed information from the CR in some/most/all cases.
One example:
HBase operator has three roles (master, regionServer, restServer). If I create a HBase server CR with a restServer component and then remove it later (entirely, not setting replicas to 0) our operator will not clean up the STS that belongs to this role.
This is done when all stale STSs (and other resources that are no longer needed) are cleaned up.
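A minimal sketch of the cleanup decision described above, assuming the operator can list the names of the resources it currently owns and can compute the names it wants to exist from the CR. The function name and the resource names in the usage note are hypothetical; the real operator would work on Kubernetes objects, not plain strings.

```rust
use std::collections::BTreeSet;

/// Returns the names of resources that exist in the cluster but are no longer
/// described by the custom resource, i.e. the candidates for deletion.
/// (Hypothetical helper, not part of operator-rs.)
fn stale_resources(existing: &BTreeSet<String>, desired: &BTreeSet<String>) -> Vec<String> {
    // Set difference: everything we own that the CR no longer asks for.
    existing.difference(desired).cloned().collect()
}
```

With owned StatefulSets for master, regionServer and restServer and a CR that only lists master and regionServer, the function returns just the restServer StatefulSet, which is the one that is left behind today.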
NOTE: This is part of an epic (stackabletech/issues#203) and might not apply to this operator. If that is the case please comment on this issue and just close it. This issue was created as part of a special bulk creation operation.
Most settings can be overridden today already even though they are not exposed as fields in the CRD.
This functionality however is hidden and not documented.
Please document all the possible files, CLI and Env (if applicable) that can be overridden.
You can take a look at the Druid PR: stackabletech/druid-operator#154
But please feel free to improve if you find anything.
net2 crate has been deprecated; use socket2 instead
Details | |
---|---|
Status | unmaintained |
Package | net2 |
Version | 0.2.37 |
URL | deprecrated/net2-rs@3350e38 |
Date | 2020-05-01 |
The net2 crate has been deprecated and users are encouraged to consider socket2 instead.
See advisory page for additional details.
Currently this operator watches resources in all namespaces.
I'd like this to be configurable so I can specify which namespace to watch.
This should be a clap argument (which then can be provided on the command line or in an env var) called --watch-namespace.
It is okay to only take a single namespace for now.
See stackabletech/issues#162 for the overarching epic
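The requested behaviour could be sketched as follows. This is a std-only illustration of the resolution order (command line first, then environment, then all namespaces); the issue asks for a clap argument, and the real implementation would let clap do the parsing. The enum, the function name and the idea of passing the environment value in as a parameter are assumptions made for this sketch.

```rust
/// Which namespaces the operator should watch (hypothetical type).
#[derive(Debug, PartialEq)]
enum WatchNamespace {
    All,
    One(String),
}

/// Resolves the namespace from CLI arguments first, then from an environment
/// variable value, and falls back to watching all namespaces.
/// `env_value` is passed in rather than read here to keep the function testable.
fn resolve_watch_namespace(args: &[String], env_value: Option<String>) -> WatchNamespace {
    let mut from_cli = None;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if let Some(value) = arg.strip_prefix("--watch-namespace=") {
            // Form: --watch-namespace=foo
            from_cli = Some(value.to_string());
        } else if arg == "--watch-namespace" {
            // Form: --watch-namespace foo
            from_cli = iter.next().cloned();
        }
    }
    match from_cli.or(env_value) {
        Some(ns) if !ns.is_empty() => WatchNamespace::One(ns),
        _ => WatchNamespace::All,
    }
}
```

A caller would pass the collected `std::env::args()` and the result of `std::env::var(...)` for whichever variable name is chosen; the issue does not name the variable.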
This is a bug, the information should only be printed after CLI stuff has been handled.
We could argue for an additional CLI flag to also print this, but that's a different matter. The current implementation makes the printing of the CRDs useless as it includes the diagnostic information.
tempdir crate has been deprecated; use tempfile instead
Details | |
---|---|
Status | unmaintained |
Package | tempdir |
Version | 0.3.7 |
URL | rust-lang-deprecated/tempdir#46 |
Date | 2018-02-13 |
The tempdir crate has been deprecated and the functionality is merged into tempfile.
See advisory page for additional details.
As a user of services deployed by this operator I'd like to know how to discover its connection details.
It's done when
NOTE: This ticket is part of an epic and autocreated for all our operators. It might not apply to this operator in particular, in that case please comment and close
stackabletech/documentation#86
Currently, when one calls an operator with one of the crd subcommands but no actual parameter, the operator will just start.
We'd like to print the help message instead.
Currently
stackable-operator crd restart -> no help message, operator starts
stackable-operator crd restart -p -> CRD is printed, operator exits
Intended
stackable-operator crd restart -> help message is printed, operator exits
stackable-operator crd restart -p -> CRD is printed, operator exits
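The intended behaviour boils down to a two-way decision in which neither branch starts the operator. A minimal sketch, assuming the arguments after `crd <name>` are available as a slice; the enum and function are hypothetical, and only the `-p` flag is taken from the issue text.

```rust
/// What the binary should do for a `crd <name>` invocation (hypothetical type).
#[derive(Debug, PartialEq)]
enum CrdAction {
    PrintHelp,
    PrintCrd,
}

/// Decides the action from the arguments that follow `crd <name>`.
/// In both cases the process is expected to exit afterwards instead of
/// starting the operator.
fn crd_action(rest: &[&str]) -> CrdAction {
    if rest.contains(&"-p") {
        CrdAction::PrintCrd
    } else {
        CrdAction::PrintHelp
    }
}
```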
As a user/admin, I want to make policy decisions based on user attributes such as group membership, that I want to define in LDAP. I want to use these user attributes in my RegoRules in OPA.
This is done when
(Note: This could be SSSD, but it's only one option)
Original below
A couple of points from our discussion:
We are still figuring out how we will do that exactly.
Previous issue text for context:
As a user/admin, I want to make policy decisions based on user attributes such as group membership, that I want to define in LDAP. I want to use these user attributes in my RegoRules in OPA.
OPA already has documentation on how to do this, there are 2 (4) variants:
Since not every product might support forwarding LDAP information to OPA, and since we probably have access to the LDAP server anyways, I'm inclined to go with Option 2. Between 2a, 2b and 2c I am unsure. It's worth reading the linked docs for detailed trade-offs, there's a summary table at the end as well outlining some trade-offs.
Some things to consider are:
See stackabletech/spark-operator#118 for an example
This is blocked by stackabletech/operator-rs#192
part of: stackabletech/documentation#408
example of an improved landing page: https://docs.stackable.tech/home/stable/druid/index.html
The new landing page should feature:
To split the usage guide, turn it into a section like here. For pages that exist in Druid/other operators, try to keep the same ordering as in those operators.
See stackabletech/hbase-operator#291 and stackabletech/docker-images#283 for reference.
This is part of stackabletech/issues#288
Related: #373
This should only be done once updated with templating from stackabletech/operator-templating#55.
build.rs
xml-rs is Unmaintained
Details | |
---|---|
Status | unmaintained |
Package | xml-rs |
Version | 0.8.4 |
URL | https://github.com/netvl/xml-rs/issues |
Date | 2022-01-26 |
xml-rs is an XML parser that has open issues around parsing, including integer overflows / panics that may or may not be an issue with untrusted data.
Together with these open issues and its Unmaintained status, xml-rs may or may not be suited to parse untrusted data.
See advisory page for additional details.
The OPA-Service is currently not scraped by the monitoring service (Prometheus).
Label OPA pods upon creation so that they are found by the monitoring service.
23.4.0
In 23.1 the bundle-builder container within the OPA DaemonSet was called opa-bundle-builder.
In #420 it was renamed to bundle-builder.
When upgrading an OPA cluster 23.1 -> 23.4, the opa-operator patches the DaemonSet, so we end up with both containers.
The bundle-builder container fails with
thread 'main' panicked at 'error binding to 0.0.0.0:3030: error creating server listener: Address already in use (os error 98)', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/warp-0.3.3/src/server.rs:213:27
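One way to picture how both containers can survive the upgrade is a list merge keyed by container name, which adds or updates entries but never removes them. The sketch below assumes that this is how the patch treats the container list; the report does not state the patch strategy, and the function is purely illustrative.

```rust
/// Merges `desired` container names into `existing`, keyed by name, the way a
/// name-keyed list merge would: entries are added or kept, never removed.
/// (Illustrative only; not the operator's actual patch code.)
fn merge_by_name(existing: &[&str], desired: &[&str]) -> Vec<String> {
    let mut merged: Vec<String> = existing.iter().map(|s| s.to_string()).collect();
    for name in desired {
        // A renamed container has a new key, so it is appended, not replaced.
        if !merged.iter().any(|m| m == name) {
            merged.push(name.to_string());
        }
    }
    merged
}
```

Merging the 23.4 containers (opa, bundle-builder) into the 23.1 ones (opa, opa-bundle-builder) yields all three, and the two bundle builders then compete for port 3030, which matches the "Address already in use" panic.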
These Acceptance Criteria need to be met:
This depends on stackabletech/docker-images#6 which provides initial Docker images but might require further changes to the images.
In the OPA operator repository we build two Docker images: the operator itself and the bundle builder.
The OPA operator then references the OPA product image (built from the docker repository) as well as the bundle builder (built from the operator repository).
Both end up in our docker.stackable.tech/stackable repository.
The OPA product image is configurable via the new ProductImage (#385).
The bundle builder image is still hardcoded to the stackable docker repository.
This will only work offline if the mirrored repository is named exactly like the stackable repository.
OPA operator should work offline and independent of the Stackable repository.
The executable will be stored in the Nexus package repository and downloaded with curl during the OPA product image Docker build.
Decided on Solution 1. Extract the bundle builder from the operator and use in the opa image.
epic: stackabletech/documentation#237
Acceptance criteria:
getting_started exists, with an index.adoc, an installation.adoc and first_steps.adoc
installation.adoc is removed
.yaml and shell snippets are in the examples/code directory and can be executed as a script, to test the documentation
template_docs.sh script and the templating_vars.yaml file.
Annotation prometheus.io/scrape: "true" should be added to role services
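A small sketch of adding the annotation, assuming the Service annotations are handled as a string-to-string map (Kubernetes annotation values are strings, which is why "true" is quoted above). The helper name is hypothetical.

```rust
use std::collections::BTreeMap;

/// Returns the given annotations with the Prometheus scrape annotation added.
/// The value is the string "true", not a boolean.
fn with_scrape_annotation(mut annotations: BTreeMap<String, String>) -> BTreeMap<String, String> {
    annotations.insert("prometheus.io/scrape".to_string(), "true".to_string());
    annotations
}
```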
We currently do not explain the concept of the bundle builder (except two sentences in https://docs.stackable.tech/opa/stable/implementation-notes.html).
A Rego config map is shown in the getting started guide but not further explained. This should be better documented (in combination with the bundle builder).
From the CRD Review.
ansi_term is Unmaintained
Details | |
---|---|
Status | unmaintained |
Package | ansi_term |
Version | 0.12.1 |
URL | ogham/rust-ansi-term#72 |
Date | 2021-08-18 |
The maintainer has advised that this crate is deprecated and will not receive any maintenance.
The crate does not seem to have many dependencies and may or may not be ok to use as-is.
Last release seems to have been three years ago.
The below list has not been vetted in any way and may or may not contain alternatives;
See advisory page for additional details.
I noticed that the operator ignores the "replicas" setting on a rolegroup. That is unexpected. I know that it uses a daemonset and that therefore the replicas setting probably doesn't make sense, but maybe we should remove it in that case.
Not very high priority though I believe.
The docs tests do not wait for OPA to pick up the update before testing whether the ConfigMap was loaded. This should be fixed somehow. Ideally we would know when the bundle has been loaded, but it looks like we cannot; in that case we would need to use a "sleep".
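If some observable condition exists (for example, a query that only succeeds once the new rule is active), a bounded polling loop is a common alternative to a single fixed sleep. The following is a generic sketch of that idea, not the docs tests' actual mechanism; the helper name and its parameters are assumptions.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `check` repeatedly until it returns true or `timeout` elapses.
/// Returns whether the condition was met in time.
fn wait_until<F: FnMut() -> bool>(mut check: F, timeout: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Back off briefly before asking again.
        sleep(interval);
    }
}
```

Compared with one long sleep, this returns as soon as the condition holds and still fails deterministically when the timeout is reached.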
failure is officially deprecated/unmaintained
Details | |
---|---|
Status | unmaintained |
Package | failure |
Version | 0.1.8 |
URL | rust-lang-deprecated/failure#347 |
Date | 2020-05-02 |
The failure crate is officially end-of-life: it has been marked as deprecated by the former maintainer, who has announced that there will be no updates or maintenance work on it going forward.
The following are some suggested actively developed alternatives to switch to:
See advisory page for additional details.
Data race when sending and receiving after closing a oneshot channel
Details | |
---|---|
Package | tokio |
Version | 0.1.22 |
URL | tokio-rs/tokio#4225 |
Date | 2021-11-16 |
Patched versions | >=1.8.4, <1.9.0,>=1.13.1 |
Unaffected versions | <0.1.14 |
If a tokio::sync::oneshot channel is closed (via the oneshot::Receiver::close method), a data race may occur if the oneshot::Sender::send method is called while the corresponding oneshot::Receiver is awaited or calling try_recv.
When these methods are called concurrently on a closed channel, the two halves of the channel can concurrently access a shared memory location, resulting in a data race. This has been observed to cause memory corruption.
Note that the race only occurs when both halves of the channel are used after the Receiver half has called close. Code where close is not used, or where the Receiver is not awaited and try_recv is not called after calling close, is not affected.
See tokio#4225 for more details.
See advisory page for additional details.
Which new version of OpenPolicyAgent should we support?
v0.40.0
Additional information
https://github.com/open-policy-agent/opa/releases/tag/v0.40.0
Changes required
Are there any upstream changes that we need to support?
e.g. new features, changed features, deprecated features etc.
Implementation checklist
Please don't change anything in this list.
Not all of these steps are necessary for all versions.
encoding is unmaintained
Details | |
---|---|
Status | unmaintained |
Package | encoding |
Version | 0.2.33 |
URL | lifthrasiir/rust-encoding#127 |
Date | 2021-12-05 |
Last release was on 2016-08-28. The issue inquiring as to the status of the crate has gone unanswered by the maintainer.
See advisory page for additional details.