This codifies some of the discussions from the wg-policy Wednesday meetings, specifically during the OSCAL alignment project, and proposes ways to embody those ideas. Many of these fell into the bucket of "some operator outside the PolicyReport CR should worry about these"...so now we can worry :)
As part of policy frameworks and requirements like CSA STAR/CCM, NIST SP 800-53/800-171/800-190, and DoD DISA STIGs, or risk management frameworks like the NIST RMF, policy is not applied in a vacuum. Policy is itself a control, or is supportive of a control requirement or implementation, to manage risks and threats. Policy-as-code even more so.
For a system such as Kubernetes, if you just pick an arbitrary set of configuration items - or better, use a tool like the CIS benchmarks - to write policy rules/checks and report the results, you can't determine whether you have met the requirements of a framework. In practice, you can't really say much of anything about the security or compliance posture of your cluster other than that some assortment of rules is passing. To have an effective, auditable, or even understandable cluster policy implementation, you need to map policy to a set of controls and vice versa.
What is unique to a declarative system like Kubernetes is that the desired state is expressed by API calls (ignoring workload container breakouts for now). The components (or assets) are already inventoried by definition in kube-apiserver/etcd. Thus the controls just need to be defined, and they can be applied to all objects via labels or annotations.
It seems to me that to more efficiently apply, maintain, and report policy in the context of a dynamic (but API-defined and enumerable) system like Kubernetes, it would be fairly simple to define a CRD for a control definition and perhaps a catalog resource. This aligns with OSCAL, of course, which makes it very NIST-compatible from day one...and NIST publishes nice YAML and JSON catalogs we can use as a test harness.
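As a rough sketch of what that could look like (the API group, kinds, and fields here are entirely hypothetical, loosely modeled on OSCAL's catalog/control layout, not an existing CRD):

```yaml
# Hypothetical Catalog and Control resources -- names and fields are
# illustrative only, loosely following OSCAL's catalog/control structure.
apiVersion: wgpolicyk8s.io/v1alpha1
kind: ControlCatalog
metadata:
  name: nist-800-53-rev5
spec:
  title: "NIST SP 800-53 Revision 5"
  # NIST publishes machine-readable catalogs we could import as a test harness
  source: "https://github.com/usnistgov/oscal-content"
---
apiVersion: wgpolicyk8s.io/v1alpha1
kind: Control
metadata:
  name: ac-6
  labels:
    catalog: nist-800-53-rev5
spec:
  id: AC-6
  title: "Least Privilege"
```

The catalog resource mainly anchors provenance; the individual Control objects are what policy rules and reports would reference.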
These CRs could be used by policy engines to map (via tags, namespaces, annotations, or ...) policy code rules to object configuration checks, both at deploy time (admission control) and at runtime (drift detection).
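For example, a policy engine rule could carry an annotation naming the control(s) it implements; a Kyverno ClusterPolicy is used here only as a familiar shape, and the annotation key is an assumption, not any existing standard:

```yaml
# Hypothetical mapping of a policy rule to control IDs.
# The "control.wgpolicyk8s.io/ids" annotation key is illustrative only.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-run-as-non-root
  annotations:
    # controls this rule supports (catalog control IDs)
    control.wgpolicyk8s.io/ids: "AC-6,CM-6"
spec:
  validationFailureAction: Audit
  rules:
    - name: check-run-as-non-root
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Containers must run as non-root."
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true
```

The same annotation could be propagated onto the resulting PolicyReport entries, so results roll up to controls both at admission time and during drift detection.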
RHACM and the Compliance Operator already have controls defined in OpenControl YAML, and these could very easily be adapted.
TBD: Does Argo have something like this already? Cloud Custodian? Others?
If there are existing CRDs for this that we can contribute to - that's just fine with me. We don't have to reinvent or fork the wheel.
Anyway, once the catalog and control objects exist, they can be queried for status/state and whatever other data is needed to quickly assess compliance and security (two different, sometimes overlapping, things). Suggested control object data (loosely adapted from NIST SP 800-53A guidance):
- state/behavior specifications (presumably these are tagged policy rules)
  - with parameters - see #50
- differences between desired and actual state/behavior (these would be tagged PolicyReports)
- metadata to facilitate analysis and risk-based decision making
- threshold for "completeness" (i.e. defect or failure rate) of state/behavior tests/checks
- timeliness/frequency of policy checks
- metadata that adjusts or normalizes priority of each control requirement (and thus the risk and impact scores for a given Policy Report?)
- RACI metadata for human owners of controls and PolicyReports?
- metadata about confidentiality, integrity, and availability impact?
- failure modes?
- TTPs or IOCs, e.g. MITRE ATT&CK links?
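Pulling the bullets above together, a Control object's spec might carry fields like the following (purely illustrative; every field name and the API group are assumptions for discussion):

```yaml
# Illustrative sketch of the suggested control object data above;
# the API group, kind, and all field names are hypothetical.
apiVersion: wgpolicyk8s.io/v1alpha1
kind: Control
metadata:
  name: ac-6
spec:
  id: AC-6
  ruleSelector:                  # tagged policy rules implementing this control
    matchLabels:
      control: ac-6
  thresholds:
    maxFailureRate: "0.05"       # "completeness" threshold (defect/failure rate)
    checkInterval: 24h           # timeliness/frequency of policy checks
  priorityWeight: 0.8            # adjusts/normalizes risk and impact scoring
  impact:                        # confidentiality/integrity/availability metadata
    confidentiality: high
    integrity: moderate
    availability: low
  owners:                        # RACI metadata for human owners
    responsible: platform-team@example.com
    accountable: ciso@example.com
  references:                    # TTP/IOC links, e.g. MITRE ATT&CK
    - https://attack.mitre.org/tactics/TA0004/
```

Differences between desired and actual state/behavior would live in the tagged PolicyReports rather than on the Control itself, with the Control's status aggregating them.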