jetstack-secure's Issues

Split "exporters" into "matchers" and "exporters"

The current "exporters" put together evaluation results and metadata and then render the result in a certain format.

The process of putting together results and metadata should be decoupled from rendering that "pack" into different formats. This is where the "matcher" + "exporter" pattern comes into play.

As part of this change, the interface for the exporters needs to change to accept a Report struct containing both the result and the metadata, instead of receiving the policy manifest and a result collection as it does now.
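
A minimal sketch of what the decoupled types could look like (all type and method names here are illustrative, not the final API):

package exporter

// PolicyManifest stands in for the existing manifest type.
type PolicyManifest struct {
    ID string
}

// RuleResult holds the outcome of evaluating a single rule.
type RuleResult struct {
    ID      string
    Success bool
}

// Report bundles evaluation results with their metadata.
type Report struct {
    Metadata map[string]string
    Results  []RuleResult
}

// Matcher assembles results and metadata into a Report.
type Matcher interface {
    Match(manifest PolicyManifest, results []RuleResult) (Report, error)
}

// Exporter renders an assembled Report into one output format,
// instead of receiving the manifest and raw results directly.
type Exporter interface {
    Export(report Report) ([]byte, error)
}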

Document the running of the agent

We need to document the following:

  • how to run the agent command (from source for now)
  • what the different values in the agent config file do

The target audience for this documentation is a developer. It might also be good to show how to use the echo server for testing.

Development version instead of actual version in docker image

The binary in the docker image does not display the actual version information; it falls back to the development defaults.

What happened

$ docker run quay.io/jetstack/preflight:v0.1.9 version -v
Preflight version:  development

Commit:
Built:
Go:

What should happen

$ docker run quay.io/jetstack/preflight:v0.1.9 version -v
Preflight version:  v0.1.9

Commit: <commit hash>
Built: <build timestamp>
Go: <Go version>
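
The usual Go pattern for this is to compile placeholder defaults into a version package and override them with linker flags at release time; the symptom above suggests the Docker build isn't passing those flags. A minimal sketch, assuming a hypothetical version package and variable names (not the repo's actual layout):

package version

// Defaults, seen when the linker flags below are not passed
// ("Preflight version: development").
var (
    PreflightVersion = "development"
    Commit           = ""
    BuildDate        = ""
    GoVersion        = ""
)

// Overridden at build time with something like:
//   go build -ldflags "-X <module>/pkg/version.PreflightVersion=v0.1.9 \
//     -X <module>/pkg/version.Commit=$(git rev-parse HEAD)"
// The fix would be to make sure the Dockerfile's build step passes
// the same flags as the regular release build.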

Scrape whitelisted namespaces to allow more restricted RBAC

Acceptance Criteria:


  • it is possible to run preflight with permissions to scrape a single namespace without knowing the other namespaces in the cluster to exclude

Assumptions:

  • Some users will not be able to use cluster-wide viewer permissions to run preflight.

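A sketch of why this unlocks tighter RBAC: with client-go, listing pods in a named namespace needs only a namespaced Role/RoleBinding, whereas listing across the cluster needs a ClusterRole. The namespace name here is illustrative:

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

func main() {
    cfg, err := rest.InClusterConfig()
    if err != nil {
        panic(err)
    }
    clientset, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    // Listing within a single namespace only requires namespace-scoped
    // RBAC; listing with "" (all namespaces) requires cluster-wide rights.
    pods, err := clientset.CoreV1().Pods("team-a").List(context.Background(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    fmt.Printf("gathered %d pods\n", len(pods.Items))
}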

Create manifests to deploy preflight agent

In order to help users install Preflight in their clusters, we should provide a guide and some manifests to do this.

Sub-tasks:

  • Create a readme explaining the installation of the agent
  • Explain the authorization requirements of data gatherers, perhaps linking to updated data gatherer docs for things like k8s RBAC.
  • Create a deployment yaml to run the agent (1 replica)

Acceptance Criteria:

  • There is a clear process outlined to install preflight with the k8s/pods and gke data gatherers configured.

Assumptions:

  • we're only worrying about k8s/pods and gke for now.
  • users want to run preflight in a k8s cluster

This task relates to this milestone

Limit namespaces used in k8s datagatherer

This is a user request to be able to ignore pods in the kube-system namespace, as these are often not under the control of users in managed clusters.

Sub-tasks:

  • support excluding namespaces in a k8s datagatherer config

Acceptance Criteria:

  • I should be able to run the pods datagatherer without sending the pod data from the kube-system namespace

Assumptions:

  • this is a good first port of call, and label selectors and other more advanced filters are not needed at this time.

This task relates to this milestone
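
A sketch of the filtering the sub-task describes, applied after gathering (function name and config shape are assumptions, not the real datagatherer code):

package gatherer

import corev1 "k8s.io/api/core/v1"

// filterNamespaces drops pods whose namespace is in the exclude list,
// e.g. exclude = []string{"kube-system"}.
func filterNamespaces(pods []corev1.Pod, exclude []string) []corev1.Pod {
    excluded := make(map[string]bool, len(exclude))
    for _, ns := range exclude {
        excluded[ns] = true
    }
    kept := make([]corev1.Pod, 0, len(pods))
    for _, p := range pods {
        if !excluded[p.Namespace] {
            kept = append(kept, p)
        }
    }
    return kept
}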

Implement a k8s secret gatherer

We have recently implemented a generic k8s data gatherer #99. This is being used to fetch resources for the private certmanager package. To make this package ready for real use, we need to make sure that when secrets are gathered, only their metadata is sent.

Acceptance Criteria:

  • when I run a k8s/secrets.v1 datagatherer - only the metadata is sent
  • it is not possible to configure the agent to send secret data to the backend

Risks:

  • people may have secret data in resources other than k8s Secrets

Assumptions:

  • we're not going to be running policy on secret content

This task relates to this milestone
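
A minimal sketch of the redaction this implies, assuming the gatherer works with typed Secret objects (the function name is illustrative):

package gatherer

import corev1 "k8s.io/api/core/v1"

// redactSecret strips everything but metadata before the secret leaves
// the cluster. Note that metadata itself can leak values: the
// kubectl.kubernetes.io/last-applied-configuration annotation may
// contain a full copy of the secret, so it is scrubbed too.
func redactSecret(s *corev1.Secret) {
    s.Data = nil
    s.StringData = nil
    delete(s.Annotations, "kubectl.kubernetes.io/last-applied-configuration")
}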

A flag for setting the log level and more verbose log output

I am stuck with a non-working preflight agent on my cluster. Before going further (i.e., using mitmproxy to see what is going on with the HTTP request being made to platform.jetstack.io), I wondered: is there a "debug" mode that would make the logs a bit more verbose?

By verbose, I would probably expect to see some of the request being made: see the 200 OK and so on. And maybe also some of the payload and the HTTP headers on the request and response.

The agent does not seem to have any --level or -v flag though:

% preflight agent -h
The agent will periodically gather data for the configured data
	gatherers and send it to a remote backend for evaluation

Usage:
  preflight agent [flags]
  preflight agent [command]

Available Commands:
  info        print several internal parameters of the agent

Flags:
  -c, --agent-config-file agent.yaml   Config file location, default is agent.yaml in the current working directory. (default "./agent.yaml")
      --backoff-max-time duration      Max time for retrying failed data gatherers (given as XhYmZs). (default 10m0s)
  -k, --credentials-file string        Location of the credentials file. For OAuth2 based authentication.
  -h, --help                           help for agent
      --input-path string              Input file path, if used, it will read data from a local file instead of gathering data from clusters
      --one-shot                       Runs agent a single time if true, or continously if false
      --output-path string             Output file path, if used, it will write data to a local file instead of uploading to the preflight server
  -p, --period duration                Override time between scans in the configuration file (given as XhYmZs).
      --strict                         Runs agent in strict mode. No retry attempts will be made for a missing data gatherer's data.

Use "preflight agent [command] --help" for more information about a command.

Consider removing public packages

In this commit 1bab9a1 we remove testing of packages in this repo.

We are considering removing the packages from this repository, since they are no longer really relevant to the agent's operation.

Index report bucket contents

Currently, Preflight will export data to a bucket in the following 'directory' structure:

clusterA:
  timestampA:
    packageA.json
    packageB.json
  timestampB:
    ...
clusterB:
  ... 

This bucket is queried when the data is loaded for display in the frontend by the (closed source) preflight backend.

The issue with this format is that all the reports must be listed to find the latest one (since we can't order the results from object storage).

Let's change the format to make it possible to find the latest report in constant time.

Goals:

  • minimize the complexity in the writes to object storage
  • create a structure which helps minimize the complexity in the reads of the same data
  • create a structure which allows the latest report for each cluster to be found and loaded quickly
  • avoid duplicated data in the bucket
  • never delete any data from the bucket

Non goals:

  • pagination of results for a single cluster's reports (this can happen at a later date)
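
One sketch that meets these goals is a tiny pointer object per cluster, updated on every write; readers fetch the pointer, then the report, in constant time, with no listing and no duplicated report data. The key layout and GCS client usage here are illustrative, not a final design:

package bucket

import (
    "context"

    "cloud.google.com/go/storage"
)

// writeReport stores the report under its timestamped key, then
// overwrites a small "latest" object holding only the newest timestamp.
// Readers load <cluster>/latest, then <cluster>/<timestamp>/<pkg>.json.
func writeReport(ctx context.Context, bkt *storage.BucketHandle, cluster, timestamp, pkg string, report []byte) error {
    w := bkt.Object(cluster + "/" + timestamp + "/" + pkg + ".json").NewWriter(ctx)
    if _, err := w.Write(report); err != nil {
        w.Close()
        return err
    }
    if err := w.Close(); err != nil {
        return err
    }
    // The pointer is tiny, so overwriting it does not duplicate or
    // delete any report data.
    ptr := bkt.Object(cluster + "/latest").NewWriter(ctx)
    if _, err := ptr.Write([]byte(timestamp)); err != nil {
        ptr.Close()
        return err
    }
    return ptr.Close()
}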

Do not report rules marked as manual

In some packages, we have rules marked as manual: true.

That was an attempt to have rules that cannot be checked automatically, but require a human to check them manually (see #73).

The problem is that generated reports mark those rules as missing: true, which is true but can be confusing.

We should discuss ways to improve that.

Investigate the best way to authenticate on Azure

Right now, we use a short-lived token for the AKS data-gatherer, and storage account plus secret key for the blob storage (via environment variables).

The short-lived token is a problem for obvious reasons; we need something that lasts longer.

The storage account and key are fine, but we would like to see if there is something like a credentials file to match what we have for GCS.

As a user I'd like the installation of the agent to be simpler

Currently there are instructions on how to deploy the Preflight agent to a cluster in the README.md. These include inline Kubernetes manifests which users can copy, edit, and deploy.

Ideally the deployment process should be even easier and more automated. This should improve the user experience, make it less likely that mistakes are introduced while copying manifests, and make casual users more likely to deploy the agent and try Preflight.

The only part of the deployment that the user must edit is the agent configuration. Most of this configuration can have sensible defaults, except the user token. This means it should all be achievable with a few commands, which users can copy from the README.md and run. We probably can't quite have 'one click' installation but can reduce the required steps to a few commands for a minimal deployment.

Ideas

  • The most minimal option would be for users to just curl a file with several Kubernetes manifests, edit the config or use sed to inject a token value from an environment variable, and kubectl apply the file.

  • A tidier version of this would be to use Kustomize for deployment. Users could fetch a default config file and edit as required. Kustomize Secret/ConfigMap generation would then get this into the cluster, inject values using patches, and deploy the other default manifests. Users can make their own overlays to modify the rest of the deployment.

  • Another option is a Helm chart. This seems overkill but is how a lot of people expect to install applications into their cluster so is worth considering.

  • Manifests could be generated by the backend, so when a user signs up and a token is generated a set of manifests is also generated which they can just download and apply, or apply directly from an endpoint we provide.

Acceptance Criteria:


  • There is a minimal deployment process
  • The deployment process has automated testing

Risks:

  • Poor deployment automation could make the process more painful for users
    • This can be mitigated by ensuring the deployment process is tested
  • Obfuscating the deployment manifests could put users off as they're not sure what they're putting in their cluster
    • This can be mitigated by ensuring that manifests are still visible in the repo and are well documented
  • Overcomplicating the deployment process can make it harder for developers to maintain
    • This requires us to agree a balance of automation and maintainability.

Assumptions:

  • Users want easier deployment
  • Users might make mistakes copying manifests
  • Casual users will be put off by having to copy and edit multiple YAML files.

Dependencies:

  • This will introduce a dependency on whatever process we use to test the deployment automation
    • For example we may introduce kind as a dependency for testing
  • This may also introduce a deployment tool as a dependency
    • For example Kustomize or Helm

Sub-tasks:

  • Agree a deployment approach
  • Implement the deployment approach
  • Implement testing for the deployment
  • Get a user to try the deployment and give feedback

Config file schema

I feel that it would be better if we defined a Go struct for the Preflight config and un-marshalled YAML into this, rather than using viper.GetString throughout the code.

This will make it clearer what the structure of the config should be, and make it clearer when it changes. It also means that the config can be checked more before Preflight actually runs anything.

This would also allow us to add a command to check a config file, if we felt this was useful.
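
A minimal sketch of the typed schema this proposes; the exact field set is a guess based on the options discussed elsewhere in this tracker:

package config

import "gopkg.in/yaml.v2"

// Config is a typed schema for the agent config file.
type Config struct {
    Schedule      string         `yaml:"schedule"`
    Endpoint      string         `yaml:"endpoint"`
    Token         string         `yaml:"token"`
    DataGatherers []DataGatherer `yaml:"data-gatherers"`
}

// DataGatherer identifies one configured data gatherer.
type DataGatherer struct {
    Kind string `yaml:"kind"`
    Name string `yaml:"name"`
}

// Parse unmarshals the whole file into the struct up front, replacing
// scattered viper.GetString lookups and enabling early validation
// (and a hypothetical "check config" command).
func Parse(data []byte) (*Config, error) {
    var c Config
    if err := yaml.Unmarshal(data, &c); err != nil {
        return nil, err
    }
    return &c, nil
}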

Allow disabling of checks in a package

We have packages of checks that preflight can run. While these are intended to be used in full, sometimes checks will need to be disabled.

Preflight yaml changes

It makes sense to me to configure this in one of the following ways:

Namespaced List of Excludes

enabled-packages:
  - "examples.jetstack.io/gke_basic"

ignored-checks:
  - "examples.jetstack.io/gke_basic/networking/private_cluster"

This is simple and likely the easiest to implement, but I don't like how the package names and namespaces are duplicated.

Checks listed under package

enabled-packages:
- name: "examples.jetstack.io/gke_basic"
  disabled-checks:
  - private_cluster

This appears to be more concise but requires a change in the data format for packages, making it marginally more complex to implement.

Whitelist/Blacklist

enabled-packages:
- name: "examples.jetstack.io/gke_basic"
  enabled-checks: # whitelist
  - private_cluster
  disabled-checks: # blacklist
  - master_auth

Both fields are optional; if both are present, the whitelist is used. This is marginally more complex to implement but makes it easier to use a single check from a package (using the whitelist).
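
A sketch of the resolution logic this implies (names are illustrative):

package checks

// enabledChecks resolves which checks from a package run: if a
// whitelist is present it wins, otherwise every check not on the
// blacklist runs.
func enabledChecks(all, whitelist, blacklist []string) []string {
    if len(whitelist) > 0 {
        return whitelist
    }
    blocked := make(map[string]bool, len(blacklist))
    for _, c := range blacklist {
        blocked[c] = true
    }
    var enabled []string
    for _, c := range all {
        if !blocked[c] {
            enabled = append(enabled, c)
        }
    }
    return enabled
}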

Changes to reports

The excluded checks are not only absent from the report but are also not run. This is important to avoid running checks where the results are not important to the user.

wdyt @j-fuentes @wwwil ?

Remove all references to check

We are now almost at the point where we are ready to start onboarding public beta users to the new tool.

As part of clearing up the messaging around the product, we should make the public repo all about the agent.

  • remove the check documentation
  • remove the check command code

[agent] Implement minimal config file loading for agent

Sub-tasks

  • implement a means of loading config - much the same as preflight core
  • configure schedule variable
  • configure endpoint variable
  • configure identity token variable
  • configure loaded data gatherer variables

This is just to load the vars from the file - nothing else.

Users of the agent need this to be able to control how the agent works when it's running in their clusters. If we one day have a generated yaml, this might form the content of a configmap to be loaded into the running agent.

Acceptance Criteria:

  • I can run the agent with different config
  • for now, the agent just prints the config and exits

Internal: Created from: https://github.com/jetstack/preflight-private/issues/260 Needs: https://github.com/jetstack/preflight-private/issues/263 Enables: https://github.com/jetstack/preflight-private/issues/265

Specify workload identity as requirement

When following the guide to install Preflight in a cluster on GKE, having workload identity enabled on the cluster should be listed as a requirement. Without it, the second Terraform project cannot be applied.

[agent] Make agent share identity with backend

Users of the agent need to be able to prove their identity so that they have access to all the features of the backend. When uploading data, the agent should share its identity.

Sub-tasks

  • share the configured token as a bearer token header or similar
  • handle 401 errors gracefully with a good error message

Acceptance Criteria:

  • Agent token shows in request header (which would allow a backend to determine its ID)
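
A sketch of the two sub-tasks above, attaching the configured token as a bearer credential and turning a 401 into a clear error; endpoint handling and error text are illustrative, not the agent's real code:

package client

import (
    "bytes"
    "fmt"
    "net/http"
)

// uploadReadings posts gathered data with the agent's token in the
// Authorization header, and handles rejection gracefully.
func uploadReadings(c *http.Client, endpoint, token string, body []byte) error {
    req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
    if err != nil {
        return err
    }
    req.Header.Set("Authorization", "Bearer "+token)
    req.Header.Set("Content-Type", "application/json")
    resp, err := c.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode == http.StatusUnauthorized {
        return fmt.Errorf("backend rejected the agent token (401): check the token in the agent config")
    }
    return nil
}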

Notes

The use of token auth doesn't scale in the long term as we add many agent tokens to the backend configuration. In the future it's likely we'll offload authentication to some third party; I use Auth0 as an example.

In this version of the future, agents would obtain an access token from Auth0 to talk to the backend using 'machine to machine' auth. We'd create client credentials to give to installers of agents automatically, perhaps using an API endpoint like this.

Our use of simple static auth doesn't make this option harder to reach in future; to the person installing the agent, it looks almost identical. It'd also be easy to run both Auth0 and static token auth at the same time if needed.


Internal: Created from: https://github.com/jetstack/preflight-private/issues/260 Pre-req: https://github.com/jetstack/preflight-private/issues/266

As an agent user, I'd like to be able to use the agent via an http proxy

This is a user request being written up as a story.

It's common in certain companies to route all traffic via an HTTP proxy; we should support this to allow those in such environments to use preflight.

Assumptions:

  • we're talking about the http_proxy environment variable here

Sub-tasks:

  • implement a means of detecting the use of http_proxy in the environment
  • route via the configured proxy if it's working (e.g. if it can reach jetstack.io?)
  • if the proxy is not working, crash the agent with an error

Acceptance Criteria:

Acceptance Criteria is a set of test scenarios specific to this particular issue. It should capture functional and non-functional requirements.

  • it's possible to send data from dgs to an external backend via an http proxy
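
Go's default transport already honours http_proxy / https_proxy / no_proxy via http.ProxyFromEnvironment, so a sketch of wiring it explicitly is small; the crash-early behaviour in the sub-tasks would need an extra startup probe through the proxy, which is not shown here:

package client

import "net/http"

// newProxyAwareClient routes requests via the proxy configured in the
// http_proxy / https_proxy / no_proxy environment variables, if set.
func newProxyAwareClient() *http.Client {
    return &http.Client{
        Transport: &http.Transport{
            Proxy: http.ProxyFromEnvironment,
        },
    }
}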

Remove the artificial prefix `preflight_` in the rego rules

Preflight assumes there is some conversion between the rule IDs in the Policy Manifest and the Rego files.

This conversion is needed because rule IDs in Rego have certain limitations.

But also, we have introduced the preflight_ prefix as part of this convention. In my opinion, this is very artificial and not necessary at all. I think this was introduced in the early times of Preflight to distinguish which rules were actually Preflight rules and which ones were just helpers. This is no longer necessary, as the Policy Manifest is the source of truth for what is a Preflight rule.

Create generator for Preflight Packages

Add a command (e.g. preflight package generate) that generates the skeleton of a package so it is easier for users to get started writing their own packages.

Document gathering custom data with the local datagatherer

Currently Preflight has a set of built-in datagatherers. This is a limitation, because if someone wants to create a new package but doesn't have a datagatherer that provides the information for it, the only option is to write Go code and create a new datagatherer. This might not be ideal for some people.

Ideally, there should be a generic way to plug external data, provided by an arbitrary process or file, into Preflight.

Support GKE regional clusters

At the moment the Preflight GKE data gatherer can only be configured for a single-zone cluster.

Add support for regional clusters.

Github workflows fail to tag images after push

The release-master and release-tag workflows are failing to re-tag the image in quay.io:

docker buildx imagetools create quay.io/jetstack/preflight:b2384c5add501ec97516fc906ceab6967ff868a8 --tag quay.io/jetstack/preflight:latest
error: failed commit on ref "index-sha256:fcc90749fc55b6d51f932b3178a710be08cfe89b70beb57aa1d0fbe4ef2e3c66": unexpected status: 401 Unauthorized

The image was pushed correctly with the commit tag: https://quay.io/repository/jetstack/preflight/manifest/sha256:fcc90749fc55b6d51f932b3178a710be08cfe89b70beb57aa1d0fbe4ef2e3c66

It seems to be an issue with buildkit: docker/buildx#327 open-policy-agent/gatekeeper#665

Allow loading data to mock several data gatherers from a single local file

#180 adds a nice feature to output the data gathered to a file. That file contains data from all the data-gatherers.

It would be great to have an option that allows the agent to run and use that file as the source of data to mock all the data gatherers.

We already have the option local in the configuration for a data-gatherer, but that loads an input file for each data-gatherer.

The feature that is described here should work by loading a file with the exact format of the file that the flag in #180 outputs.

Easier end-to-end testing

We should make it easier to do end-to-end testing of Preflight. While there are Go tests for components, the only way to test the whole application currently is against a real cluster. As Preflight is designed to work with various platforms (GKE, AKS, etc.), testing support for all of them requires many clusters, which is not convenient.

Make gathering of data from many data sources more reliable

Right now, if we fail to gather data, that data is missing from the readings. This means that as we increase the number of requests made to gather data, we dramatically increase the likelihood that the whole data gather stage partially fails.

Ideally, we retry gatherers when they fail so that the agent is more reliable.

They should back off exponentially.
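
A sketch of exponential backoff around one data gatherer, independent of whatever #185 actually implements: retry on failure, doubling the delay, until a maximum time is exhausted (this mirrors the --backoff-max-time flag visible in the agent's help output elsewhere on this page):

package gatherer

import (
    "fmt"
    "time"
)

// retryGather retries fetch with exponentially growing delays until it
// succeeds or maxTime would be exceeded.
func retryGather(fetch func() error, maxTime time.Duration) error {
    delay := time.Second
    deadline := time.Now().Add(maxTime)
    for {
        err := fetch()
        if err == nil {
            return nil
        }
        if time.Now().Add(delay).After(deadline) {
            return fmt.Errorf("giving up after backoff: %w", err)
        }
        time.Sleep(delay)
        delay *= 2 // exponential backoff
    }
}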

PR: #185

Split deployment manifests into groups

Currently, we have just a Kustomize 'base' for Preflight that contains manifests for both RBAC-related resources and the CronJob, which means that it all gets applied with one kubectl apply -k ...

The problem

Typically within an organization, some administrator is going to create the Namespace, ServiceAccount, etc., and then another system is going to deploy the workload (the CronJob in this case).

The way these manifests are provided at the moment makes it hard to apply them in two steps.

Make it possible to run the agent as a one-off task

During one-off reviews of clusters, it's helpful to run the agent only once rather than as a long-running process.

To support this use case it should be possible to set a flag on the agent that means the data gathering step only happens once.

It should also not sleep when only running once, so that the command exits as fast as possible.
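
A minimal sketch of the main loop this describes (names are illustrative):

package agent

import "time"

// run gathers once and exits immediately in one-shot mode, or keeps
// gathering on the configured period otherwise.
func run(oneShot bool, period time.Duration, gatherAndUpload func()) {
    for {
        gatherAndUpload()
        if oneShot {
            return // no final sleep: exit as fast as possible
        }
        time.Sleep(period)
    }
}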

Make package lint descend into subdirectories

When testing packages using the Preflight CLI tool it is possible to test multiple packages together like so:

$ preflight package test ./preflight-packages
...
2020/01/20 14:04:59 All packages tests passed :)

Here Preflight recursively descends into subdirectories to find packages. However this does not work when linting packages. For example:

$ preflight package lint preflight-packages
2020/01/20 14:08:44 Linting package preflight-packages
2020/01/20 14:08:44 Lint: preflight-packages - Unable to read manifest path
2020/01/20 14:08:44 Encountered lint errors
exit status 1

The ability to descend into subdirectories should be added to the lint command.
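
A sketch of making lint recurse the way package test does: walk the tree and lint every directory containing a policy manifest. The manifest filename and the lintPackage helper are assumptions, not the real CLI code:

package lint

import (
    "os"
    "path/filepath"
)

// lintAll walks root and lints each directory holding a manifest file.
func lintAll(root string, lintPackage func(dir string) error) error {
    return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }
        if !info.IsDir() && info.Name() == "policy-manifest.yaml" {
            return lintPackage(filepath.Dir(path))
        }
        return nil
    })
}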

[agent] Create a new agent binary/command

We're going to cut back the scope of the cluster component to a simpler 'agent'.

This agent binary will have a reduced set of responsibilities:

  • gathering data on a schedule
  • uploading data for evaluation to some backend

This story is to just build the binary that we can start adding other features to afterwards.

This is being done to simplify the installation of preflight and support the remote evaluation of packages.

Acceptance Criteria:

  • It is possible to build an agent binary from the makefile in the project

Internal Use: Created from https://github.com/jetstack/preflight-private/issues/260 Enables: https://github.com/jetstack/preflight-private/issues/264

Support for testing packages

Currently, users need to use OPA's CLI to perform tests on the Rego files.

Preflight CLI should have a way to execute the tests of a package with preflight package test <package> without the user having to install OPA's CLI.

Update datagatherer documentation with example data they send

It's important that we have examples of the data sent by each data gatherer to the backend (when running the agent).

Update each datagatherer to show an example JSON payload, perhaps in the readme of the datagatherer.

This is useful for reviewing the data sent but also when making new packages.

Expose context information about why a check is failing

Screenshot from 2020-01-08 12-18-49

This is a screenshot from https://preflight.jetstack.io/

It shows the result of running our basic pod checks. We can see that some pods are missing requests and limits - oh dear.

This is based on the following report: https://github.com/jetstack/preflight/blob/master/examples/reports/pods.json#L7

All that's exposed in the report is the success/fail status, not the reason.

When using OPA as something that backs a k8s webhook the idiomatic way to do this is to return a message if the rule fails, otherwise return nothing.

In our pod example, I think I'd return the pod name & namespace in a single string, one for each violation.

I guess we might also remove the success attribute and replace it with a violations list - if there are violations then it failed?
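
A sketch of what that could look like in the report types; failure is then implied by a non-empty list. Names are illustrative, not the current report schema:

package report

// CheckResult replaces the bare success flag with a violations list.
type CheckResult struct {
    ID         string   `json:"id"`
    Violations []string `json:"violations"` // e.g. "pod default/foo is missing resource limits"
}

// Failed reports whether the check produced any violations.
func (r CheckResult) Failed() bool {
    return len(r.Violations) > 0
}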

AKS data-gatherer missing some information

Currently the AKS data gatherer collects information about the configuration of an AKS cluster from the Azure API.

https://docs.microsoft.com/en-us/rest/api/aks/managedclusters/get

The information returned includes a list of node pools, referred to as agent pools. However it does not give details of each of these pools. This needs to be fetched separately. We should get the configuration of each node pool so we can make the checks performed in the AKS package more comprehensive.

https://docs.microsoft.com/en-us/rest/api/aks/agentpools/get

These will both return separate JSON documents; in fact, there will be a JSON document for each node pool. We need to work out how this will be handled in Preflight. We could put them all in a list in a master JSON document to evaluate with Rego. Alternatively, we could make a separate AKS node pool data gatherer, but this would require support for multiple instances of the same data gatherer type to fetch multiple node pools, and seems like more work for users.

I had misunderstood the problem here. Using the az aks show --resource-group preflight --name preflight-test-wil command I can see all the required information, as described in the API spec: https://docs.microsoft.com/en-us/rest/api/aks/managedclusters/get

However when using HTTP GET requests, as the data gatherer does, some information is missing. This also occurs when doing the same thing manually with the curl command.

Related to #30

[agent] Implement data upload from agent

Agents are not able to evaluate packages, only gather data. They need to be able to upload to a backend.

Needs: https://github.com/jetstack/preflight-private/issues/265 (or it can upload an empty body for now?)

Enables: https://github.com/jetstack/preflight-private/issues/267

Sub-tasks

  • implement an agent command that gathers data and posts it to an endpoint on a given path
  • configure the endpoint and path in the config file

Acceptance Criteria:

  • there is an agent command I can run that gathers data and uploads it to the supplied endpoint

https://webhook.site can be used to test that the data is sent correctly.


Internal: Created from: https://github.com/jetstack/preflight-private/issues/260

[agent] Have agent run data gathers in its config

Users of the agent need the agent to collect data. This should be configured based on the config file contents and work much the same as it does in preflight core.

Sub-tasks

  • use preflight-core implementation to build similar data gather functionality in the agent
  • run the data gatherers configured in the config file

Acceptance Criteria:

  • for now, the agent prints the gathered data to the console in JSON format.
  • only data gatherers in the config are run

Internal: Created from: https://github.com/jetstack/preflight-private/issues/260 Needs: https://github.com/jetstack/preflight-private/issues/264
