
iam-manager's Introduction

iam-manager


AWS IAM role management for Kubernetes namespaces, handled in-cluster by a Kubernetes CRD operator.

Security:

Security is a primary concern when designing a solution that creates/updates/deletes IAM roles from inside a cluster. iam-manager uses the AWS IAM permissions boundary concept, along with other safeguards, to secure the implementation. Please check AWS Security for more details.

Supported Features

The following features are supported by iam-manager:

IAM Roles Management
IAM Role for Service Accounts (IRSA)
AWS Service-Linked Roles
Default Trust Policy for All Roles
Maximum Number of Roles per Namespace
Attaching Managed IAM Policies for All Roles
Multiple Trust policies

iam-manager config-map

This document explains the ConfigMap variables.

Additional Info

iam-manager is built with kubebuilder and, like other kubebuilder projects, uses cert-manager to manage the TLS certificates for its webhooks.

Usage:

The following is a sample Iamrole spec:

apiVersion: iammanager.keikoproj.io/v1alpha1
kind: Iamrole
metadata:
  name: iam-manager-iamrole
spec:
  # Add fields here
  PolicyDocument:
    Statement:
      -
        Effect: "Allow"
        Action:
          - "s3:Get*"
        Resource:
          - "arn:aws:s3:::intu-oim*"
        Sid: "AllowS3Access"
  AssumeRolePolicyDocument:
    Version: "2012-10-17"
    Statement:
      -
        Effect: "Allow"
        Action: "sts:AssumeRole"
        Principal:
          AWS:
            - "arn:aws:iam::XXXXXXXXXXX:role/20190504-k8s-kiam-role"

To submit: kubectl apply -f iam_role.yaml -n namespace1

Installation:

The simplest way to install iam-manager, along with the IAM role it needs to do its job, is to run the install.sh script.

Update the allowed policies in allowed_policies.txt and the ConfigMap properties in config_map to match your environment before you run install.sh.

Note: You must be a cluster admin with KUBECONFIG exported, and you must have administrator access to the underlying AWS account with credentials exported.

example:

export KUBECONFIG=/Users/myhome/.kube/admin@eks-dev2-k8s  
export AWS_PROFILE=admin_123456789012_account
./install.sh [cluster_name] [aws_region] [aws_profile]
./install.sh eks-dev2-k8s us-west-2 aws_profile

To enable the webhook, and/or to update your installation of iam-manager to work with kiam, please check Installation for detailed instructions.

❤ Contributing ❤

Please see CONTRIBUTING.md.

Developer Guide

Please see DEVELOPER.md.

iam-manager's People

Contributors

carlyjiang, ccpeng, dependabot[bot], diranged, draychev, kevdowney, kshamajain99, lili-wan, mnkg561, nrosenberg1, shaoxt, tekenstam, wanghong230


iam-manager's Issues

PolicyDocument.Statement[].Resource not allowing single element

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
The AWS IAM policy Resource field can be either a single element or an array of elements. iam-manager can unmarshal it only when it is an array, because the CRD defines Resource as an array.

What you expected to happen:
iam-manager must accept a single element for the Resource field, as well as an array, in the YAML file.

How to reproduce it (as minimally and precisely as possible):
Create a role whose Resource field has a single element and you should see the following error:

mtvl15367e28a:playerdb nmogulla$ k apply -f /Users/nmogulla/Desktop/Eclipse_Workspace/GoProjects2/src/github.com/keikoproj/iam-manager/config/samples/iammanager_v1alpha1_iamrole.yaml
Error from server (InternalError): error when creating "/Users/nmogulla/Desktop/Eclipse_Workspace/GoProjects2/src/github.com/keikoproj/iam-manager/config/samples/iammanager_v1alpha1_iamrole.yaml": Internal error occurred: admission webhook "miamrole.kb.io" denied the request: v1alpha1.Iamrole.Spec: v1alpha1.IamroleSpec.PolicyDocument: v1alpha1.PolicyDocument.Statement: []v1alpha1.Statement: v1alpha1.Statement.Resource: []string: decode slice: expect [ or n, but found ", error found in #10 byte of ...|esource":"*"},{"Acti|..., bigger context ...|":["sts:AssumeRole"],"Effect":"Allow","Resource":"*"},{"Action":["ec2:Describe*"],"Effect":"Allow","|...
mtvl15367e28a:playerdb nmogulla$ 

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Load testing IAM manager

Is this a BUG REPORT or FEATURE REQUEST?:
Test

What happened:
We will use this issue to document load testing results

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Create CFN template for CloudWatch Rule and Lambda

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
For the iam-manager role Lambda, we need a way to install the Lambda, as well as the CloudWatch rule that triggers it, as part of a CloudFormation (CFN) template.

Sample Event Rule:

{
  "source": [
    "aws.iam"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "iam.amazonaws.com"
    ],
    "userIdentity": {
      "arn": [
        "arn:aws:sts::123456789012:assumed-role/k8s-iam-manager-role"
      ]
    }
  }

What you expected to happen:
The Lambda and the CloudWatch event rule must be created using only the CFN template.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Allow option to attach managed policies to all the roles

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
There might be use cases where an organization wants to attach one or more policies to all roles, and we should allow those policies to be attached to every role managed by iam-manager.

What you expected to happen:
iam-manager must attach all the configured managed policies to every role, without the user providing them explicitly for each role.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Delete role must check if role exists in the target account

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
While a user is deleting a role, if the IAM role has already been deleted in the target account but the same request reaches the reconciler again, iam-manager goes into a loop because it cannot list the policies of a role that doesn't exist.

AWS IAM throws an access denied error because the role doesn't have the iam-manager tag(???)

2020-02-07T19:23:09.202Z	ERROR	awsapi.iam.DeleteRole	Unable to list attached managed policies for role	{"request_id": "b46b40e2-a620-487e-a22f-4714f16eb69c", "error": "AccessDenied: User: arn:aws:sts::000065563193:assumed-role/k8s-cluster-iam-manager-role/kiam-kiam is not authorized to perform: iam:ListAttachedRolePolicies on resource: role k8s-this-is-my-test-namespace\n\tstatus code: 403, request id: b3534f00-01de-4fa8-8f7f-75502a878986"}

What you expected to happen:
We should check whether the role exists before deleting it.
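The expected behavior can be sketched as an existence check that treats an already-missing role as success, so the reconciler stops looping. The interface, the error sentinel, and the function names below are illustrative stand-ins for the aws-sdk-go calls (GetRole / NoSuchEntity), not iam-manager's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchEntity stands in for AWS's NoSuchEntity error; in the real
// controller this would come from aws-sdk-go's IAM error codes.
var errNoSuchEntity = errors.New("NoSuchEntity")

// roleGetter abstracts the AWS IAM GetRole call so the deletion logic can
// be tested without AWS.
type roleGetter interface {
	GetRole(name string) error
}

// ensureRoleDeleted treats "role already gone" as success, so the reconciler
// does not loop retrying ListAttachedRolePolicies on a nonexistent role.
func ensureRoleDeleted(api roleGetter, name string, deleteFn func(string) error) error {
	if err := api.GetRole(name); err != nil {
		if errors.Is(err, errNoSuchEntity) {
			return nil // already deleted in the target account: nothing to do
		}
		return err
	}
	return deleteFn(name)
}

// fakeIAM is a test double holding the set of roles that "exist".
type fakeIAM struct{ existing map[string]bool }

func (f fakeIAM) GetRole(name string) error {
	if !f.existing[name] {
		return errNoSuchEntity
	}
	return nil
}

func main() {
	api := fakeIAM{existing: map[string]bool{"k8s-ns1-role": true}}
	del := func(n string) error { fmt.Println("deleting", n); return nil }
	fmt.Println(ensureRoleDeleted(api, "k8s-gone-role", del)) // nil error, delete never called
	fmt.Println(ensureRoleDeleted(api, "k8s-ns1-role", del))  // deletes, then nil error
}
```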

How to reproduce it (as minimally and precisely as possible):
Try to delete a role in k8s and in AWS at almost the same time.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

2020-02-07T19:23:09.111Z	INFO	controllers.iamrole_controller.Reconcile	Start of the request	{"request_id": "b46b40e2-a620-487e-a22f-4714f16eb69c"}
2020-02-07T19:23:09.111Z	INFO	controllers.iamrole_controller.Reconcile	Iamrole delete request	{"request_id": "b46b40e2-a620-487e-a22f-4714f16eb69c"}
2020-02-07T19:23:09.116Z	DEBUG	awsapi.iam.DeleteRole	Initiating api call	{"request_id": "b46b40e2-a620-487e-a22f-4714f16eb69c"}
2020-02-07T19:23:09.202Z	ERROR	awsapi.iam.DeleteRole	Unable to list attached managed policies for role	{"request_id": "b46b40e2-a620-487e-a22f-4714f16eb69c", "error": "AccessDenied: User: arn:aws:sts::000065563193:assumed-role/k8s-cluster-iam-manager-role/kiam-kiam is not authorized to perform: iam:ListAttachedRolePolicies on resource: role k8s-this-is-my-test-namespace\n\tstatus code: 403, request id: b3534f00-01de-4fa8-8f7f-75502a878986"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/pkg/awsapi.(*IAM).DeleteRole
	/workspace/pkg/awsapi/iam.go:428
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:93
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88

Status update should not requeue request for the reconciliation

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
As part of testing PR [https://github.com//pull/82], we found that there is a delay in setting the status of the IamRole resource; because of that delay, the same request comes back with status "" instead of Ready, which triggers IamRole creation again. Handling that scenario gracefully is taken care of in #82, but the underlying issue is that the request came back for reconciliation at all. That happened because the resource status update resulted in a requeue. We already implemented a predicate function to ignore reconciliation for status updates, but it does not seem to be working as expected.

What you expected to happen:

A status update should not result in a reconciliation event; therefore a delay in setting the IAM role status should not trigger IamRole creation again.
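The intended predicate behavior can be sketched with a generation comparison: status-only updates leave metadata.generation unchanged, while spec changes increment it. This is a simplified, stdlib-only stand-in for controller-runtime's GenerationChangedPredicate, with illustrative type names:

```go
package main

import "fmt"

// objectMeta mimics the one field of Kubernetes object metadata that matters
// here: the API server bumps metadata.generation on spec changes but not on
// status subresource updates.
type objectMeta struct {
	Generation int64
}

// shouldReconcileUpdate returns false for status-only updates, so writing
// the Iamrole status does not requeue the same request.
func shouldReconcileUpdate(oldMeta, newMeta objectMeta) bool {
	return oldMeta.Generation != newMeta.Generation
}

func main() {
	statusOnly := shouldReconcileUpdate(objectMeta{Generation: 3}, objectMeta{Generation: 3})
	specChange := shouldReconcileUpdate(objectMeta{Generation: 3}, objectMeta{Generation: 4})
	fmt.Println(statusOnly, specChange) // false true
}
```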

How to reproduce it (as minimally and precisely as possible):
It's an intermittent issue, but the logs are attached.

Anything else we need to know?:

Environment:

  • iam-manager version
  • 0.0.6
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

2021-07-15T00:03:40.177Z DEBUG awsapi.iam.CreateRole Attaching Inline role policies {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
2021-07-15T00:03:40.177Z DEBUG awsapi.iam.UpdateRole Initiating api call {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
2021-07-15T00:03:40.270Z DEBUG awsapi.iam.UpdateRole Initiating api call {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp", "api": "UpdateAssumeRolePolicy"}
2021-07-15T00:03:40.373Z DEBUG awsapi.iam.UpdateRole AssumeRole Policy is successfully updated {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
2021-07-15T00:03:40.373Z DEBUG awsapi.iam.AttachInlineRolePolicy Initiating api call {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
2021-07-15T00:03:40.511Z DEBUG awsapi.iam.AttachInlineRolePolicy Successfully completed attaching InlineRolePolicy {"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
I0715 00:03:40.511563 1 event.go:281] Event(v1.ObjectReference{Kind:"Iamrole", Namespace:"dev-patterns-iam-usw2-devp", Name:"iamrole", UID:"0fe679b7-c327-4e04-843c-5ad9149f06e5", APIVersion:"iammanager.keikoproj.io/v1alpha1", ResourceVersion:"86803885", FieldPath:""}): type: 'Normal' reason: 'Ready' Successfully created/updated iam role
2021-07-15T00:03:40.521Z INFO controllers.iamrole_controller.HandleReconcile Successfully reconciled

^^ A k8s event was created saying the state is Ready, but a new request came back again with state "" (you can see that in the following logs):

{"request_id": "99ab2cab-dc2e-44d3-bf8e-e4d24db8f462", "iam_role_cr": "iamrole"}
2021-07-15T00:03:40.521Z INFO controllers.iamrole_controller.Reconcile Start of the request {"request_id": "96b21253-6307-4835-afed-dd7a798e8fe5"}
2021-07-15T00:03:40.521Z INFO controllers.iamrole_controller.HandleReconcile state of the custom resource {"request_id": "96b21253-6307-4835-afed-dd7a798e8fe5", "iam_role_cr": "iamrole", "state": ""}
2021-07-15T00:03:40.521Z DEBUG controllers.iamrole_controller.HandleReconcile roleName constructed successfully {"request_id": "96b21253-6307-4835-afed-dd7a798e8fe5", "iam_role_cr": "iamrole", "roleName": "k8s-dev-patterns-iam-usw2-devp"}
2021-07-15T00:03:40.521Z DEBUG internal.utils.utils.defaultTrustPolicy Default trust policy from cm {"request_id": "96b21253-6307-4835-afed-dd7a798e8fe5", "trust_policy": "{"Version": "2012-10-17","Statement": [{"Effect":

Provide an option to construct IAM ROLE name based on namespace

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
Provide an option to construct the IAM role name based on the namespace it belongs to. This should be allowed only if "iam.role.max.limit.per.namespace" = 1.

Maybe use a ConfigMap parameter to configure it, keeping the current behavior (constructing the IAM role name from the custom resource name) as the default. We could go one level further and include both the namespace and the name in the role name so it is unique across the cluster.

What you expected to happen:
If the ConfigMap is configured to construct the name from the namespace, the name should be constructed from the namespace rather than the resource name.
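The proposed naming option could look like the following sketch; the "k8s-" prefix, the function name, and the max-limit guard are assumptions for illustration, not iam-manager's actual naming format:

```go
package main

import "fmt"

// buildRoleName sketches the proposal: when perNamespace is true (valid only
// with iam.role.max.limit.per.namespace = 1) the role name is derived from
// the namespace; otherwise it falls back to the custom resource name.
func buildRoleName(namespace, crName string, perNamespace bool, maxPerNamespace int) (string, error) {
	if perNamespace {
		if maxPerNamespace != 1 {
			return "", fmt.Errorf("namespace-based naming requires max limit 1, got %d", maxPerNamespace)
		}
		return "k8s-" + namespace, nil
	}
	return "k8s-" + crName, nil
}

func main() {
	name, _ := buildRoleName("payments", "s3-reader", true, 1)
	fmt.Println(name) // k8s-payments
}
```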

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Improve logging

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
Improve the logging with enough detail.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Provide the iam-manager updates via K8s events

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
There are scenarios where iam-manager requests fail, and even though iam-manager retries, it is beneficial to add the error message or success update to Kubernetes events, so the user can understand the error while troubleshooting.

What you expected to happen:
When a user runs kubectl get events, it should show what is going on inside iam-manager, whether requests are succeeding or failing.

How to reproduce it (as minimally and precisely as possible):
Provide a wrong policy action and try to create an Iamrole custom resource. This results in an AWS API error response, and iam-manager keeps retrying without telling the user why the request is failing. Only users with cluster-admin access can read the iam-manager logs, which doesn't help users who have only namespace-admin access.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Handle Reconcile gracefully

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
There are scenarios where, while a custom resource is being created and after its status is set to "Inprogress", the controller might go down or be restarted. With the present logic, if the status of the resource is "..Inprogress", iam-manager does nothing, which can leave the resource in that status forever.

What you expected to happen:
The custom resource must reach the desired status irrespective of controller restarts. The operation must be idempotent, and as soon as the controller comes back up the custom resource should reach the READY status.

How to reproduce it (as minimally and precisely as possible):
Create a new Iamrole custom resource and bring the controller down before the resource is fully created.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Monitor AWS iam-manager role

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
As per the design, only the cluster administrator has access to the iam-manager namespace, but the IAM role attached to the iam-manager namespace (in the kiam installation) has a loophole: "anything" deployed in the iam-manager namespace gets access to the role attached to the namespace, so a user with malicious intent could create a new pod and start creating IAM roles, bypassing webhook validation and the controller.

What you expected to happen:
We should monitor the activity of the iam-manager role, detect any anomaly, and take action to stop further damage. The idea here is to attach a "deny all" policy.

How to reproduce it (as minimally and precisely as possible):
Create a pod inside the iam-manager namespace and create roles from it.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Include LastUpdatedTime stamp in the status

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
When we run kubectl get iam, by default it shows only "AGE", which is the time elapsed since the IAM role was created.

Along with this, it might be useful to include the RoleARN and RoleID in the status.

What you expected to happen:
It would be nice to include lastUpdatedTimestamp so users know when the IAM policy was last updated.

How to reproduce it (as minimally and precisely as possible):
Run kubectl get iam.
Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Enhance IAM policy validation

Is this a BUG REPORT or FEATURE REQUEST?:

What happened:
Currently, in order to validate an IAM policy, we ask the administrator to configure two different parameters (allowedPolicyAction and restrictedS3Resource). For each policy action we then loop through the list of allowed actions; once an action is found in the allowed list, and if it is an S3-related action, we make sure it does not target any of the restricted S3 resources.

In the future, someone might want to restrict another resource, say Route 53. Instead of adding yet another ConfigMap parameter for restricted Route 53 resources, we should enhance the validation logic to take a single document, validIAMPolicy, that contains the entire allowlist and all restricted resources.

What you expected to happen:
We should make our validation logic more scalable.
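The single-document idea could be sketched as an action allowlist plus restricted resources keyed by service, so adding Route 53 restrictions needs no new ConfigMap parameter. All type and field names below are assumptions, not iam-manager's API:

```go
package main

import (
	"fmt"
	"strings"
)

// validIAMPolicy sketches the proposed single validation document: one
// allowlist of actions plus restricted resource prefixes keyed by service.
type validIAMPolicy struct {
	AllowedActions      []string            // e.g. "s3:Get*"
	RestrictedResources map[string][]string // service -> forbidden resource ARN prefixes
}

// validate checks one (action, resource) pair against the document.
func (p validIAMPolicy) validate(action, resource string) error {
	allowed := false
	for _, a := range p.AllowedActions {
		if a == action || (strings.HasSuffix(a, "*") && strings.HasPrefix(action, strings.TrimSuffix(a, "*"))) {
			allowed = true
			break
		}
	}
	if !allowed {
		return fmt.Errorf("action %q is not in the allowed list", action)
	}
	// The service prefix of the action selects which restrictions apply,
	// generalizing the S3-only check to any service.
	service := strings.SplitN(action, ":", 2)[0]
	for _, r := range p.RestrictedResources[service] {
		if strings.HasPrefix(resource, r) {
			return fmt.Errorf("resource %q is restricted for %s", resource, service)
		}
	}
	return nil
}

func main() {
	policy := validIAMPolicy{
		AllowedActions:      []string{"s3:Get*", "route53:ChangeResourceRecordSets"},
		RestrictedResources: map[string][]string{"s3": {"arn:aws:s3:::audit-logs"}},
	}
	fmt.Println(policy.validate("s3:GetObject", "arn:aws:s3:::public-assets/readme")) // allowed
	fmt.Println(policy.validate("s3:GetObject", "arn:aws:s3:::audit-logs/x"))         // restricted
}
```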

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Support Kubernetes 1.22+

Is this a BUG REPORT or FEATURE REQUEST?:

FEATURE REQUEST

What happened:

We've gotten ourselves into a pickle... we were on Kubernetes (EKS) 1.21, and we have upgraded to 1.22. We are now finding our iam-manager pods failing, likely due to incompatible client/API libraries:

2022-05-12T03:32:09.245Z	ERROR	controller-runtime.webhook.webhooks	unable to decode the request	{"webhook": "/validate-iammanager-keikoproj-io-v1alpha1-iamrole", "error": "no kind \"AdmissionReview\" is registered for version \"admission.k8s.io/v1\" in scheme \"pkg/runtime/scheme.go:101\""}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).ServeHTTP
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/webhook/admission/http.go:79
sigs.k8s.io/controller-runtime/pkg/webhook.instrumentedHook.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/webhook/server.go:129
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2036
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2416
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2831
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1919
2022-05-12T03:32:09.245Z	DEBUG	controller-runtime.webhook.webhooks	wrote response	{"webhook": "/validate-iammanager-keikoproj-io-v1alpha1-iamrole", "UID": "", "allowed": false, "result": {}, "resultError": "got runtime.Object without object metadata: &Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Status:,Message:no kind \"AdmissionReview\" is registered for version \"admission.k8s.io/v1\" in scheme \"pkg/runtime/scheme.go:101\",Reason:,Details:nil,Code:400,}"}

What you expected to happen:

Well ... I had hoped it would work. :)

How to reproduce it (as minimally and precisely as possible):

Try to run the iam-manager on a Kubernetes 1.22+ cluster.

Trust role hard coded

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT
What happened:
At the moment, any newly created role has only one role in its trust policy, and that role looks hard-coded from the POC. Allow the master role to be included in the trust policy via the ConfigMap.
What you expected to happen:
Which roles are added to the role's trust policy should be controlled per environment, and a ConfigMap parameter can be used to control that.
How to reproduce it (as minimally and precisely as possible):
Create a role and check its trust policy.
Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Allow overwriting the role name construction in the CR itself

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
There might be use cases where iam-manager is configured to construct the name from the namespace or the resource name, but an administrator wants exceptions that don't follow the default behavior.

We should probably allow people to override the naming behavior in the CR itself, perhaps via an annotation, a spec field, or a combination of the two.

What you expected to happen:
Due to unforeseen reasons, if an administrator wants to create a role that doesn't follow the configured naming convention, we should allow it.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Unable to update status

Is this a BUG REPORT or FEATURE REQUEST?:
Bug

What happened:
The IAM policy was updated by adding a forbidden policy, but the status was not updated to PolicyNotAllowed. Instead, the controller returned the following error:

2020-09-25T00:23:35.892Z	ERROR	controllers.iamrole_controller.UpdateStatus	Unable to update status	{"request_id": "41e0254d-a604-41e6-a6c4-13e6cc81850f", "status": "PolicyNotAllowed", "error": "Iamrole.iammanager.keikoproj.io \"iamrole\" is invalid: status.lastUpdatedTimestamp: Invalid value: \"null\": status.lastUpdatedTimestamp in body must be of type string: \"null\""}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).UpdateStatus
	/workspace/controllers/iamrole_controller.go:360
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).HandleReconcile
	/workspace/controllers/iamrole_controller.go:144
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:98
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88

What you expected to happen:
CR status should have updated to PolicyNotAllowed

Environment:

  • iam-manager version
  • Kubernetes version :

$ kubectl version -o yaml

Max limit per namespace config not working

Is this a BUG REPORT or FEATURE REQUEST?:
Bug Report

What happened:
Set iam.role.max.limit.per.namespace to 5 and added a second IAM role to a namespace. When I attempted to create the role (via ArgoCD), the resource did not sync, and I see this in the logs for the iam-manager pod:

2021-05-04T23:22:55.964Z DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/validate-iammanager-keikoproj-io-v1alpha1-iamrole", "UID": "8d8acd6f-5985-4c38-a709-0edc5c2db644", "allowed": false, "result": {}, "resultError": "got runtime.Object without object metadata: &Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Status:,Message:,Reason:Iamrole.iammanager.keikoproj.io "kafka-connect-s3-sink-connector-role" is invalid: metadata.namespace: Invalid value: "lightstream": only 1 role is allowed per namespace,Details:nil,Code:403,}"}

What you expected to happen:

The second IAM role should have been created.

How to reproduce it (as minimally and precisely as possible):

Set iam.role.max.limit.per.namespace > 1 and attempt to create a second role in a namespace.

Anything else we need to know?:

I believe the issue is on this line
https://github.com/keikoproj/iam-manager/blob/master/api/v1alpha1/iamrole_webhook.go#L229

which should be checking config.MaxRolesAllowed similar to this:

if config.Props.MaxRolesAllowed() < len(iamRoles.Items) {
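The intended comparison can be sketched as a small helper that uses the configured limit instead of a hard-coded 1 (names are illustrative, not iam-manager's actual webhook code):

```go
package main

import "fmt"

// roleQuotaExceeded compares the number of existing Iamroles in the
// namespace against the configured iam.role.max.limit.per.namespace value,
// rather than against a hard-coded limit of 1.
func roleQuotaExceeded(existing, maxAllowed int) bool {
	return existing >= maxAllowed
}

func main() {
	// With the limit set to 5, a namespace that already has 1 role
	// should be allowed a second one.
	fmt.Println(roleQuotaExceeded(1, 5)) // false: creation allowed
	fmt.Println(roleQuotaExceeded(5, 5)) // true: quota reached
}
```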

Environment:

  • iam-manager version
    0.0.6
  • Kubernetes version :
clientVersion:
  buildDate: "2021-02-21T20:21:49Z"
  compiler: gc
  gitCommit: e87da0bd6e03ec3fea7933c4b5263d151aafd07c
  gitTreeState: clean
  gitVersion: v1.20.4
  goVersion: go1.15.8
  major: "1"
  minor: "20"
  platform: darwin/amd64
serverVersion:
  buildDate: "2020-12-23T22:10:21Z"
  compiler: gc
  gitCommit: 49a6c0bf091506e7bafcdb1b142351b69363355a
  gitTreeState: clean
  gitVersion: v1.19.6-eks-49a6c0
  goVersion: go1.15.5
  major: "1"
  minor: 19+
  platform: linux/amd64

Other debugging information (if applicable):

Relevant bit of controller logs:

2021-05-04T23:22:55.956Z	DEBUG	controller-runtime.webhook.webhooks	received request	{"webhook": "/validate-iammanager-keikoproj-io-v1alpha1-iamrole", "UID": "8d8acd6f-5985-4c38-a709-0edc5c2db644", "kind": "iammanager.keikoproj.io/v1alpha1, Kind=Iamrole", "resource": {"group":"iammanager.keikoproj.io","version":"v1alpha1","resource":"iamroles"}}
2021-05-04T23:22:55.956Z	INFO	v1alpha1.ValidateCreate	validating create request	{"name": "kafka-connect-s3-sink-connector-role"}
2021-05-04T23:22:55.957Z	INFO	v1alpha1.validateIAMPolicy	validating IAM policy	{"name": "kafka-connect-s3-sink-connector-role"}
2021-05-04T23:22:55.957Z	DEBUG	k8s.client.IamrolesCount	list api call
2021-05-04T23:22:55.964Z	INFO	k8s.client.IamrolesCount	Total number of roles	{"count": 1}
2021-05-04T23:22:55.964Z	DEBUG	controller-runtime.webhook.webhooks	wrote response	{"webhook": "/validate-iammanager-keikoproj-io-v1alpha1-iamrole", "UID": "8d8acd6f-5985-4c38-a709-0edc5c2db644", "allowed": false, "result": {}, "resultError": "got runtime.Object without object metadata: &Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Status:,Message:,Reason:Iamrole.iammanager.keikoproj.io \"kafka-connect-s3-sink-connector-role\" is invalid: metadata.namespace: Invalid value: \"lightstream\": only 1 role is allowed per namespace,Details:nil,Code:403,}"}

awsapi unit test cases

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
Improve the unit test coverage by writing unit test cases for awsapi package

What you expected to happen:
Test coverage should be more than 85% for awsapi package

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Bug: Your "keikoproj/iam-manager:latest" image is bad

Is this a BUG REPORT or FEATURE REQUEST?:

This is a bug report

What happened:

The keikoproj/iam-manager:latest image is referenced in the various resource templates for this project. This image is bad. It comes up and runs, but it appears to be based on someone's custom branch (perhaps your own?) with a whole bunch of things hard-coded in it. It was trying to create IAM roles and policies with hard-coded AWS account IDs. When we initially tested it, we were sure it was a trojan horse.

I strongly recommend you take that image down.

How to reproduce it (as minimally and precisely as possible):

Pull down the image and run it in a new AWS account ID... watch it fail.

MalformedPolicyDocument: Invalid principal in policy: "AWS":"arn:aws:iam::000065563193:role/masters.ops-prim-ppd.cluster.k8s.local"

Set properties to context

Is this a BUG REPORT or FEATURE REQUEST?:
Feature

What happened:
Currently, we are setting properties in global variables. Instead, we should set properties in a "properties" struct in the context.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Bug: iam.managed.policies is required

Is this a BUG REPORT or FEATURE REQUEST?:

Bug Report

What happened:

The docs say that iam.managed.policies is optional. However, it appears to be required in the v0.0.6 release. If you leave it unset, you get failures like this:

I1218 06:31:56.616501       1 event.go:281] Event(v1.ObjectReference{Kind:"Iamrole", Namespace:"iam-manager-system", Name:"iam-manager-iamrole-irsa", UID:"407b984d-8ce1-40e7-a342-071c1cd515b4", APIVersion:"iammanager.keikoproj.io/v1alpha1", ResourceVersion:"12755025", FieldPath:""}): type: 'Warning' reason: 'Error' Unable to create/update iam role due to error InvalidInput: ARN arn:aws:iam::xxx:policy/ is not valid.

What you expected to happen:

Simply put, no AttachRolePolicy IAM calls should be made if no default policy is configured in the configmap.

Anything else we need to know?:

Once I created a blank IAM Policy and configured it as the default iam.managed.policies setting, we were fine.
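A minimal Go sketch of the behavior this report asks for (the function name and comma-separated config format are illustrative, not iam-manager's actual code): an unset or empty iam.managed.policies value should produce no policy ARNs at all, so no AttachRolePolicy calls are ever made, instead of a malformed `arn:aws:iam::xxx:policy/` ARN.

```go
package main

import (
	"fmt"
	"strings"
)

// managedPolicyARNs builds the managed-policy ARNs that would be attached to
// a role. Blank entries (including a fully unset config value) are skipped,
// so an empty configmap setting yields zero AttachRolePolicy calls rather
// than one call with an invalid ARN ending in "policy/".
func managedPolicyARNs(accountID, commaSeparated string) []string {
	var arns []string
	for _, p := range strings.Split(commaSeparated, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // skip blanks instead of emitting an invalid ARN
		}
		arns = append(arns, fmt.Sprintf("arn:aws:iam::%s:policy/%s", accountID, p))
	}
	return arns
}

func main() {
	fmt.Println(managedPolicyARNs("123456789012", ""))              // no ARNs
	fmt.Println(managedPolicyARNs("123456789012", "shared-policy")) // one valid ARN
}
```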

Webhook enablement should be a CLI flag

Is this a BUG REPORT or FEATURE REQUEST?:

Changing webhook.enabled from false to true seems like it should be a CLI argument that triggers a new deployment of the app.

What happened:

When you just switch the configmap from webhook.enabled: 'false' to webhook.enabled: 'true', iam-manager reloads the configmap but does not start listening on port 9443. However, it DOES create the webhook registration with the API server. This causes new requests for those resources to start failing. This should be a CLI argument rather than a configmap setting, because it fundamentally alters the way the code behaves and requires the process to begin listening on a new port.
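The switch could be modeled as a process-lifetime CLI flag, as sketched below. This is illustrative only: the flag name `--webhook-enabled` is hypothetical, and the real fix would wire the value into the controller-runtime manager setup.

```go
package main

import (
	"flag"
	"fmt"
)

// parseWebhookFlag decides whether to serve the webhook, based on a
// hypothetical --webhook-enabled CLI flag. Because the value is fixed at
// process start, flipping it always goes through a restart/redeploy, so the
// manager either registers the webhook AND listens on :9443, or does neither;
// the half-configured state described above cannot occur.
func parseWebhookFlag(args []string) (bool, error) {
	fs := flag.NewFlagSet("iam-manager", flag.ContinueOnError)
	enabled := fs.Bool("webhook-enabled", false, "serve the validating webhook on :9443")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

func main() {
	on, _ := parseWebhookFlag([]string{"--webhook-enabled=true"})
	fmt.Println(on)
}
```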

Add Namespace and cluster names as part of the Tags

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
AWS IAM is a global service, so all IAM role names within an account must be unique across regions. Two clusters deployed in the same account (in the same or different regions) may each create an IAM role with the same name, in which case cluster B could silently overwrite one of cluster A's IAM roles. We need to make sure this cannot happen.

This can be avoided by adding the cluster name and namespace to the IAM role's tags. If the role already exists, we must check that the namespace (and cluster) match the existing tags; otherwise, respond to the user with "Role name already taken; please use a different name".
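The ownership check described above can be sketched in Go as follows. The tag keys "Cluster" and "Namespace" are illustrative; they are not necessarily the keys iam-manager would actually use.

```go
package main

import "fmt"

// ownershipConflict reports whether an existing role's tags indicate it is
// owned by a different cluster/namespace pair, in which case the controller
// should reject the request instead of overwriting the role.
func ownershipConflict(existingTags map[string]string, cluster, namespace string) bool {
	if len(existingTags) == 0 {
		return false // untagged role: no ownership information to compare
	}
	return existingTags["Cluster"] != cluster || existingTags["Namespace"] != namespace
}

func main() {
	tags := map[string]string{"Cluster": "cluster-a", "Namespace": "team-1"}
	// Cluster B asking for the same role name: conflict, reject the CR.
	fmt.Println(ownershipConflict(tags, "cluster-b", "team-1"))
}
```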

What you expected to happen:
An IAM role should not be overwritten when the same name is used by two different namespaces in two different clusters.

How to reproduce it (as minimally and precisely as possible):

  1. Create Cluster A in us-west-2
  2. Create an IAM role CR with the name "sample-role". This creates "k8s-sample-role".
  3. Create Cluster B in us-east-2
  4. Create an IAM role CR in Cluster B with the name "sample-role". This overwrites Cluster A's k8s-sample-role with Cluster B's policy.

Anything else we need to know?:

Environment:

  • iam-manager version: 0.0.3
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Documentation Request: How to correctly run the test suite

Is this a BUG REPORT or FEATURE REQUEST?:

Documentation request

What happened:

It took a while for me to figure out how to get the test suite up and running. I don't have it perfect yet, and I wonder what the actual requirements are. It would be good to clean up the developer documentation so that an outside developer can get started quickly.

How to reproduce it (as minimally and precisely as possible):

Here's the current failure I am seeing:

[diranged@ip-192-168-208-11 iam-manager ]$ KUBECONFIG=~/.kube/config KUBERNETES_SERVICE_HOST=foo KUBERNETES_SERVICE_PORT=bar LOCAL=true make test
 setting up env variables for test…
go get -u github.com/golang/mock/mockgen
go: found github.com/golang/mock/mockgen in github.com/golang/mock v1.4.4
go: golang.org/x/xerrors upgrade => v0.0.0-20200804184101-5ec99f83aff1
go: golang.org/x/tools upgrade => v0.0.0-20201218024724-ae774e9781d2
go: golang.org/x/mod upgrade => v0.4.0
mockgen is in progess
/Users/diranged/go/bin/controller-gen object:headerFile=./hack/boilerplate.go.txt paths="./..."
go fmt ./...
controllers/iamrole_controller_test.go
/Users/diranged/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/Users/diranged/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd_no_webhook/bases
go test ./... -coverprofile cover.out
?   	github.com/keikoproj/iam-manager	[no test files]
?   	github.com/keikoproj/iam-manager/api/v1alpha1	[no test files]
OK: 2 passed
Running Suite: Controller Suite
===============================
Random Seed: 1608319983
Will run 3 of 3 specs

STEP: bootstrapping test environment
2020-12-18T11:33:04.323-0800	DEBUG	controller-runtime.test-env	starting control plane	{"api server flags": []}
2020-12-18T11:33:04.333-0800	ERROR	controller-runtime.test-env	unable to start the controlplane	{"tries": 0, "error": "fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/diranged/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:270
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:232
github.com/keikoproj/iam-manager/controllers_test.glob..func2
	/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:59
reflect.Value.call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:476
reflect.Value.Call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:337
github.com/onsi/ginkgo/internal/leafnodes.newRunner.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:49
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runAsync.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:86
2020-12-18T11:33:04.336-0800	ERROR	controller-runtime.test-env	unable to start the controlplane	{"tries": 1, "error": "fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/diranged/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:270
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:232
github.com/keikoproj/iam-manager/controllers_test.glob..func2
	/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:59
reflect.Value.call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:476
reflect.Value.Call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:337
github.com/onsi/ginkgo/internal/leafnodes.newRunner.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:49
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runAsync.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:86
2020-12-18T11:33:04.340-0800	ERROR	controller-runtime.test-env	unable to start the controlplane	{"tries": 2, "error": "fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/diranged/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:270
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:232
github.com/keikoproj/iam-manager/controllers_test.glob..func2
	/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:59
reflect.Value.call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:476
reflect.Value.Call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:337
github.com/onsi/ginkgo/internal/leafnodes.newRunner.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:49
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runAsync.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:86
2020-12-18T11:33:04.343-0800	ERROR	controller-runtime.test-env	unable to start the controlplane	{"tries": 3, "error": "fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/diranged/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:270
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:232
github.com/keikoproj/iam-manager/controllers_test.glob..func2
	/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:59
reflect.Value.call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:476
reflect.Value.Call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:337
github.com/onsi/ginkgo/internal/leafnodes.newRunner.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:49
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runAsync.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:86
2020-12-18T11:33:04.346-0800	ERROR	controller-runtime.test-env	unable to start the controlplane	{"tries": 4, "error": "fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/diranged/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).startControlPlane
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:270
sigs.k8s.io/controller-runtime/pkg/envtest.(*Environment).Start
	/Users/diranged/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/envtest/server.go:232
github.com/keikoproj/iam-manager/controllers_test.glob..func2
	/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:59
reflect.Value.call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:476
reflect.Value.Call
	/usr/local/Cellar/go/1.15.5/libexec/src/reflect/value.go:337
github.com/onsi/ginkgo/internal/leafnodes.newRunner.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:49
github.com/onsi/ginkgo/internal/leafnodes.(*runner).runAsync.func1
	/Users/diranged/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:86
Failure [0.024 seconds]
[BeforeSuite] BeforeSuite 
/Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:50

  Unexpected error:
      <*fmt.wrapError | 0xc000714c00>: {
          msg: "failed to start the controlplane. retried 5 times: fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory",
          err: {
              Op: "fork/exec",
              Path: "/usr/local/kubebuilder/bin/etcd",
              Err: 0x2,
          },
      }
      failed to start the controlplane. retried 5 times: fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory
  occurred

  /Users/diranged/go/src/github.com/keikoproj/iam-manager/controllers/suite_test.go:60
------------------------------

Ran 3 of 0 Specs in 0.025 seconds
FAIL! -- 0 Passed | 3 Failed | 0 Pending | 0 Skipped
--- FAIL: TestAPIs (0.02s)
FAIL
coverage: 2.3% of statements
FAIL	github.com/keikoproj/iam-manager/controllers	1.764s
ok  	github.com/keikoproj/iam-manager/internal/config	0.444s	coverage: 69.6% of statements
ok  	github.com/keikoproj/iam-manager/internal/utils	1.419s	coverage: 82.8% of statements
ok  	github.com/keikoproj/iam-manager/pkg/awsapi	1.478s	coverage: 86.0% of statements
?   	github.com/keikoproj/iam-manager/pkg/awsapi/mocks	[no test files]
?   	github.com/keikoproj/iam-manager/pkg/k8s	[no test files]
?   	github.com/keikoproj/iam-manager/pkg/log	[no test files]
ok  	github.com/keikoproj/iam-manager/pkg/validation	0.425s	coverage: 87.4% of statements
FAIL
make: *** [test] Error 1

Bug: iam.policy.resource.blacklist requires a list of strings

Is this a BUG REPORT or FEATURE REQUEST?:

Bug Report

What happened:

The docs say that iam.policy.resource.blacklist is optional. However, if you leave it unset, the code (I think) defaults it to *. This causes ALL policies to fail, even your example resource:

I1218 01:38:04.716046       1 event.go:281] Event(v1.ObjectReference{Kind:"Iamrole", Namespace:"iam-manager-system", Name:"iam-manager-iamrole-irsa", UID:"b6dc3100-6203-4bfa-9009-ed91af187f4a", APIVersion:"iammanager.keikoproj.io/v1alpha1", ResourceVersion:"12568066", FieldPath:""}): type: 'Warning' reason: 'PolicyNotAllowed' Unable to create/update iam role due to error spec.PolicyDocument.Resource: Forbidden: restricted resource arn:aws:s3:::mybucket* included in the request

What you expected to happen:

It should have been allowed.

Anything else we need to know?:

Setting this key to an invalid string ("nil") works for us as a workaround for now.
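The behavior this report argues for can be sketched as follows (a hypothetical prefix-matching check, not iam-manager's actual validation code): an unset blacklist should mean an empty list, which blocks nothing, never an implicit `*`.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceBlocked checks a requested resource ARN against a blacklist of
// prefix patterns (a trailing "*" acts as a wildcard). A nil/empty blacklist
// blocks nothing; only an explicit "*" entry blocks everything.
func resourceBlocked(blacklist []string, resource string) bool {
	for _, b := range blacklist {
		if b == "*" || strings.HasPrefix(resource, strings.TrimSuffix(b, "*")) {
			return true
		}
	}
	return false // empty blacklist: allow everything
}

func main() {
	// With no blacklist configured, the example bucket must be allowed.
	fmt.Println(resourceBlocked(nil, "arn:aws:s3:::mybucket"))
}
```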

Bug: Docs say aws.accountId is optional, its not

Is this a BUG REPORT or FEATURE REQUEST?:

While you can avoid setting aws.accountId if you want, doing so breaks the code's behavior when it comes time to attach a new policy to a role or create a role. This is because the ARNs are crafted in code, and the account ID is needed for that.

What happened:

I initially left aws.accountId unset. However, new CreateRole calls were then failing. When I dug in, it was because the permissions-boundary ARN was being constructed incorrectly. Here is a snippet of the CloudTrail log:

{
	"id": "AQAAAXZ0dwWw8YasKgAAAABBWFowZk8zdkFBQ05XckhoelBXNENBQUE",
	"content": {
		"timestamp": "2020-12-18T06:08:46.000Z",
		"tags": [
			"source:aws:cloudtrail"
		],
		"attributes": {
			"eventID": "2cd2e43f-d850-4652-86f8-04ef97233e24",
			"metadata": {
				"awsregion": "us-east-1"
			},
			"aws_account": "...",
			"eventSource": "iam.amazonaws.com",
			"errorCode": "NoSuchEntityException",
			"eventName": "CreateRole",
			"http": {
				"user_agent_details": {
					"os": {
						"family": "Linux"
					},
					"browser": {
						"family": "aws-sdk-go",
						"patch": "38",
						"major": "1",
						"minor": "25"
					},
					"device": {
						"family": "Other",
						"category": "Desktop"
					}
				},
				"user_agent": "aws-sdk-go/1.25.38 (go1.13.15; linux; amd64)"
			},
			"userAgent": "aws-sdk-go/1.25.38 (go1.13.15; linux; amd64)",
			"userIdentity": {
				"accessKeyId": "ASIAVJ6....",
				"sessionContext": {
					"sessionIssuer": {
						"principalId": "...",
						"accountId": "...",
						"type": "Role",
						"arn": "arn:aws:iam::...:role/EKS-dev1-test-stage2-IamManagerController-IamRole-O0VKOJA39G2Y",
						"userName": "EKS-dev1-test-stage2-IamManagerController-IamRole-O0VKOJA39G2Y"
					},
					"webIdFederationData": {
						"federatedProvider": "arn:aws:iam::...:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/..."
					},
					"attributes": {
						"mfaAuthenticated": "false",
						"creationDate": "2020-12-18T05:47:25Z"
					}
				},
				"accountId": "...",
				"principalId": "...:1608270445319761940",
				"type": "AssumedRole",
				"arn": "arn:aws:sts::...:assumed-role/EKS-dev1-test-stage2-IamManagerController-IamRole-O0VKOJA39G2Y/1608270445319761940"
			},
			"eventType": "AwsApiCall",
			"type": "aws:cloudtrail",
			"eventCategory": "Management",
			"eventVersion": "1.08",
			"sourceIPAddress": "...",
			"errorMessage": "Scope ARN: arn:aws:iam:::policy/EKS-dev1-test-stage2-IamManagerController-9GZSJTSP5T78-permissions-boundary does not exist or is not attachable.",
			"requestParameters": {
				"permissionsBoundary": "arn:aws:iam:::policy/EKS-dev1-test-stage2-IamManagerController-9GZSJTSP5T78-permissions-boundary",
				"assumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":\"sts:AssumeRoleWithWebIdentity\",\"Principal\":{\"Federated\":\"arn:aws:iam::...:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/...\"},\"Condition\":{\"StringEquals\":{\"oidc.eks.us-west-2.amazonaws.com/id/...\":\"system:serviceaccount:iam-manager-system:test-iam-service-account\"}}}]}",
				"maxSessionDuration": 43200,
				"roleName": "k8s-iam-manager-system",
				"description": "#DO NOT DELETE#. Managed by iam-manager"
			},
			"readOnly": false,
			"requestID": "428322e4-74fc-42fb-898f-ae847bb821c6",
			"eventTime": "2020-12-18T06:08:46Z",
			"recipientAccountId": "...",
			"managementEvent": true,
			"timestamp": "2020-12-18T06:08:46Z"
		}
	}
}

What you expected to happen:

The code would auto-detect the account ID.

How to reproduce it (as minimally and precisely as possible):

Leave this setting unset. Create a role. Check your CloudTrail logs when things fail.
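The malformed ARN in the CloudTrail log above falls directly out of string interpolation, as this small sketch shows (the function name is illustrative). Auto-detecting the account at startup, e.g. via an STS GetCallerIdentity call, would avoid ever interpolating an empty account segment.

```go
package main

import "fmt"

// boundaryARN interpolates the account ID into a permissions-boundary ARN.
// With an unset aws.accountId, the account segment is empty, producing the
// invalid "arn:aws:iam:::policy/..." seen in the CloudTrail error above.
func boundaryARN(accountID, policyName string) string {
	return fmt.Sprintf("arn:aws:iam::%s:policy/%s", accountID, policyName)
}

func main() {
	fmt.Println(boundaryARN("", "my-permissions-boundary"))             // broken: empty account segment
	fmt.Println(boundaryARN("123456789012", "my-permissions-boundary")) // valid
}
```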

Should we support creating instance profile as well?

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
iam-manager creates an IAM role which can be assumed by application code, but should we also support creating an instance profile so that the role can be attached to a node/node group?

What you expected to happen:
As a developer, I might be creating a node group while the underlying operator does not have access to create IAM roles. That operator (instance-manager in this case) could create a custom resource to get an IAM role from iam-manager and then attach it to the node group.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Support IRSA (IAM Role for Service Accounts)

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request

What happened:
AWS released a new feature (IAM Roles for Service Accounts) to obtain AWS credentials based on the Kubernetes projected-tokens concept. This fundamentally changes the way a role can be assumed at runtime: we no longer need KIAM to assume the role, and EKS supports this out of the box.

To support this feature, iam-manager

  1. Should add the trust policy with cluster OIDC info.
    aws eks describe-cluster --name cluster_name --query "cluster.identity.oidc.issuer" --output text
    The final trust policy should look like:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:namespace:service-account-name"
        }
      }
    }
  ]
}
  2. Update the service account with the required annotation:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::AWS_ACCOUNT_ID:role/IAM_ROLE_NAME

We can probably support this feature based on a given annotation (TBD):

  1. If the annotation is present in the CR, iam-manager should make a call to AWS to get the OIDC info and add it as part of the trust policy.
  2. Also add the required annotation to the service account mentioned in the annotation. (Should we create the ServiceAccount if it does not exist?)

What you expected to happen:
Attaching an annotation to the Iamrole CR should automatically provision the required trust policy and the annotation on the service account, so that any pod using that service account is automatically able to assume the role via IRSA.
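The trust-policy construction described above can be sketched in Go. This mirrors the JSON shown earlier in this issue; the function and helper names are illustrative, not iam-manager's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// irsaTrustPolicy renders an IRSA trust policy for a single service account,
// matching the document shape shown above (Federated principal, web-identity
// action, and a StringEquals condition on the OIDC "sub" claim).
func irsaTrustPolicy(accountID, oidcProvider, namespace, serviceAccount string) (string, error) {
	doc := map[string]interface{}{
		"Version": "2012-10-17",
		"Statement": []map[string]interface{}{{
			"Effect": "Allow",
			"Principal": map[string]string{
				"Federated": fmt.Sprintf("arn:aws:iam::%s:oidc-provider/%s", accountID, oidcProvider),
			},
			"Action": "sts:AssumeRoleWithWebIdentity",
			"Condition": map[string]map[string]string{
				"StringEquals": {
					oidcProvider + ":sub": fmt.Sprintf("system:serviceaccount:%s:%s", namespace, serviceAccount),
				},
			},
		}},
	}
	b, err := json.Marshal(doc)
	return string(b), err
}

// hasClause is a tiny helper for checking the rendered document.
func hasClause(doc, clause string) bool { return strings.Contains(doc, clause) }

func main() {
	doc, _ := irsaTrustPolicy("123456789012", "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE", "team-a", "my-sa")
	fmt.Println(doc)
}
```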

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
More info:

  1. https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
  2. https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Race condition exists on rapid delete/create of Iamroles, or when IAM Role already exists in AWS

Is this a BUG REPORT or FEATURE REQUEST?:

This is a bug report, with a fix at #82.

What happened:

We began running into a situation where our ServiceAccount annotations would be set with eks.amazonaws.com/role-arn: "". After digging into the code, we discovered that the default case handler in the controller has a problematic if statement. I think its intent was to handle an intermittent failure in the READY handler, where the ServiceAccount was already set up properly... but it fails to handle the situation where the IAM role already exists.

Here is the rundown of events that can cause this failure:

  1. A new Iamrole is created.. and the controller reconciliation starts.
  2. The r.IAMClient.CreateRole function is called.
    1. The function sets roleAlreadyExists=true and logs the error.
    2. Because roleAlreadyExists, the code then skips populating the IAMRoleResponse object.
  3. The handler checks if resp.RoleARN is set. It is not because we got an empty response object.
  4. The handler tries to use the previously known good state in iamRole.Status.RoleARN ... but remember, this is a newly created Iamrole resource, so that is empty too.
  5. The ServiceAccount is populated with an empty string annotation.
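The fix described in #82 can be reduced to a small sketch: when role creation reports "already exists", look up the existing role and return its real ARN instead of an empty response. The function and error names here are illustrative stand-ins for iam-manager's actual CreateRole/GetRole plumbing.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for AWS IAM's EntityAlreadyExists error.
var errAlreadyExists = errors.New("EntityAlreadyExists")

// ensureRole creates the role, and when it already exists, adopts it by
// fetching its ARN rather than returning an empty response object. The
// create/get functions are injected so the logic is testable without AWS.
func ensureRole(name string, create func(string) (string, error), get func(string) (string, error)) (string, error) {
	arn, err := create(name)
	if errors.Is(err, errAlreadyExists) {
		return get(name) // pre-existing role: return its real ARN
	}
	return arn, err
}

func main() {
	create := func(string) (string, error) { return "", errAlreadyExists }
	get := func(name string) (string, error) { return "arn:aws:iam::123456789012:role/k8s-" + name, nil }
	arn, _ := ensureRole("sample-role", create, get)
	fmt.Println(arn)
}
```

With this shape, the ServiceAccount annotation is always populated from a real ARN, so the empty-string case in step 5 above cannot arise.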

What you expected to happen:

I expect the code to realize that the IAM role in AWS is indeed the one we are trying to use (likely previously created, but perhaps the reconciliation loop did not complete; or a fast create/delete was issued on the Iamrole resource by a tool like ArgoCD; or any number of other intermittent situations). Either way, I expect eventual consistency to work the issue out.

How to reproduce it (as minimally and precisely as possible):

Pre-create an IAM Role... then try creating an Iamrole resource.

Anything else we need to know?:

While digging into this, I also noticed that no reconciliation happens for the ServiceAccount annotation. I have a second PR (prepped at Nextdoor#4) that significantly revamps the k8s/rbac.go package to be more testable and implements regular reconciliation of the ServiceAccount annotation. We've been running the combination of the two in production for 2 weeks now with great success.

ServiceAccount resource is not considered as part of the reconciliation loop

Is this a BUG REPORT or FEATURE REQUEST?:

Bug report

What happened:

We have noticed that there is no reconciliation loop ensuring that the ServiceAccount resource is created (if desired) and that the IRSA annotations are in place. This means you have one chance and one chance only to get it created, and if anything breaks it later, you are out of luck.

What you expected to happen:

I expect the controller to continually work to ensure the desired state of the world is the state discovered in the Kubernetes API.
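The desired-state reconciliation for the annotation can be sketched as follows: on every loop, compute the annotations the ServiceAccount should have and report whether an update is needed, instead of annotating only once at creation time. The annotation key is the real EKS one; the surrounding function shape is illustrative.

```go
package main

import "fmt"

// reconcileAnnotations returns the desired annotation map and whether an
// update call is needed. Running this every reconcile restores the IRSA
// annotation if someone deletes or mutates it out-of-band.
func reconcileAnnotations(current map[string]string, roleARN string) (map[string]string, bool) {
	const key = "eks.amazonaws.com/role-arn"
	if current[key] == roleARN {
		return current, false // already in the desired state; no update
	}
	desired := map[string]string{}
	for k, v := range current {
		desired[k] = v // preserve unrelated annotations
	}
	desired[key] = roleARN
	return desired, true
}

func main() {
	got, changed := reconcileAnnotations(map[string]string{}, "arn:aws:iam::123456789012:role/k8s-demo")
	fmt.Println(changed, got["eks.amazonaws.com/role-arn"])
}
```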

How to reproduce it (as minimally and precisely as possible):

Create a new Iamrole resource that creates a matching ServiceAccount resource. Then go and delete that ServiceAccount resource. You will find that it is not re-created or checked at any point. Same thing if you change, delete, or update the IRSA annotation.

Anything else we need to know?:

This was discovered as part of #83 ...

Ignore the validation in case of Deny

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
At present, we validate the IAM policy actions against the allowed-policies whitelist from the configmap irrespective of the effect (Allow or Deny). The validation should happen only for "Allow" statements, not for "Deny".
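The intended behavior can be sketched with a simplified validator (prefix-based matching and the type names are illustrative, not iam-manager's actual validation code): Deny statements are skipped entirely, and only Allow actions are checked against the whitelist.

```go
package main

import (
	"fmt"
	"strings"
)

type statement struct {
	Effect  string
	Actions []string
}

// allowedByWhitelist validates statements against the allowed-actions
// whitelist, but only for Effect "Allow"; Deny statements pass through
// untouched, since denying extra actions can never widen permissions.
func allowedByWhitelist(whitelist []string, stmts []statement) bool {
	for _, s := range stmts {
		if !strings.EqualFold(s.Effect, "Allow") {
			continue // never reject a Deny statement
		}
		for _, a := range s.Actions {
			ok := false
			for _, w := range whitelist {
				if strings.HasPrefix(a, strings.TrimSuffix(w, "*")) {
					ok = true
					break
				}
			}
			if !ok {
				return false
			}
		}
	}
	return true
}

func main() {
	wl := []string{"s3:*"}
	// "ec2:*" in a Deny statement must be accepted.
	fmt.Println(allowedByWhitelist(wl, []statement{{Effect: "Deny", Actions: []string{"ec2:*"}}}))
}
```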

What you expected to happen:
Users should be able to include anything in a "Deny" statement.

How to reproduce it (as minimally and precisely as possible):
Try to create ec2:* in a Deny statement, and iam-manager will reject the request.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Delete role Logging throws an exception while validating after successful deletion of the role

Is this a BUG REPORT or FEATURE REQUEST?:
Bug Report

What happened:
When a role delete is triggered, after the object is successfully deleted the logs show an error from validating the already-deleted object.

2020-01-31T04:39:43.525Z	ERROR	controllers.iamrole_controller.Reconcile	unable to get iam resource from api server. ignoring it as the event will be back once its available	{"request_id": "f3b91454-a799-4140-b427-d22b1e869922", "error": "Iamrole.iammanager.keikoproj.io \"iamrole-sample-3\" not found"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:65
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88

What you expected to happen:
Logs should indicate a successful delete.

How to reproduce it (as minimally and precisely as possible):
Create and delete a role, then capture the logs with kubectl logs -f <iam-manager_pod>.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

2020-01-31T04:39:42.563Z	INFO	controllers.iamrole_controller.Reconcile	Start of the request	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:42.563Z	INFO	controllers.iamrole_controller.Reconcile	Received a Iamrole delete request	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:42.610Z	DEBUG	awsapi.iam.DeleteRole	Initiating api call	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:42.990Z	DEBUG	awsapi.iam.DeleteRole	Attached managed for role	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45", "policyList": null}
2020-01-31T04:39:43.153Z	DEBUG	awsapi.iam.DeleteInlinePolicy	Initiating api call	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:43.289Z	DEBUG	awsapi.iam.DeleteInlinePolicy	Successfully deleted inline policy	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:43.437Z	DEBUG	awsapi.iam.DeleteRole	Successfully deleted the role	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:43.437Z	INFO	controllers.iamrole_controller.Reconcile	Removing finalizer from Iamrole	{"request_id": "722c91ab-708a-4f2c-9f5d-44d7ea273d45"}
2020-01-31T04:39:43.525Z	DEBUG	controller-runtime.controller	Successfully Reconciled	{"controller": "iamrole", "request": "default/iamrole-sample-3"}
2020-01-31T04:39:43.525Z	INFO	controllers.iamrole_controller.Reconcile	Start of the request	{"request_id": "f3b91454-a799-4140-b427-d22b1e869922"}
2020-01-31T04:39:43.525Z	ERROR	controllers.iamrole_controller.Reconcile	unable to get iam resource from api server. ignoring it as the event will be back once its available	{"request_id": "f3b91454-a799-4140-b427-d22b1e869922", "error": "Iamrole.iammanager.keikoproj.io \"iamrole-sample-3\" not found"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:65
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
2020-01-31T04:39:43.525Z	DEBUG	controller-runtime.controller	Successfully Reconciled	{"controller": "iamrole", "request": "default/iamrole-sample-3"}
2020-01-31T04:39:43.525Z	INFO	controllers.iamrole_controller.Reconcile	Start of the request	{"request_id": "4bcfdb1e-76c3-4818-a2df-8deb09f50ba7"}
2020-01-31T04:39:43.525Z	ERROR	controllers.iamrole_controller.Reconcile	unable to get iam resource from api server. ignoring it as the event will be back once its available	{"request_id": "4bcfdb1e-76c3-4818-a2df-8deb09f50ba7", "error": "Iamrole.iammanager.keikoproj.io \"iamrole-sample-3\" not found"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:65
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
2020-01-31T04:39:43.525Z	DEBUG	controller-runtime.controller	Successfully Reconciled	{"controller": "iamrole", "request": "default/iamrole-sample-3"}

Bug: IAM Trust Policy is not optional in the CM?

Is this a BUG REPORT or FEATURE REQUEST?:
Bug

What happened:
I don't know when this started, but my iam-manager install stopped functioning. I had a very simple configmap that let the controller come up and discover the OIDC information for the cluster, and I did not hard-code any specific trust policy. It worked.

Then while testing another issue (#73), I discovered that the IAM roles I was creating were failing due to this error:

2021-01-28T14:20:42.394Z	ERROR	internal.utils.utils.GetTrustPolicy	unable to get the trust policy. It must follow v1alpha1.AssumeRolePolicyDocument syntax	{"request_id": "4c0ddbce-f413-4b3a-a80a-5739a0d0dd7a", "error": "default trust policy is not provided in the config map. Request must provide trust policy in the CR"}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/iam-manager/internal/utils.GetTrustPolicy
	/workspace/internal/utils/utils.go:51
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).ConstructCreateIAMRoleInput
	/workspace/controllers/iamrole_controller.go:284
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).HandleReconcile
	/workspace/controllers/iamrole_controller.go:153
github.com/keikoproj/iam-manager/controllers.(*IamroleReconciler).Reconcile
	/workspace/controllers/iamrole_controller.go:106
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
I0128 14:20:42.394601       1 event.go:281] Event(v1.ObjectReference{Kind:"Iamrole", Namespace:"observability", Name:"test-role", UID:"09c02398-10f2-4f0f-aacf-6d0ede26839f", APIVersion:"iammanager.keikoproj.io/v1alpha1", ResourceVersion:"47393684", FieldPath:""}): type: 'Warning' reason: 'Error' Unable to create/update iam role due to error default trust policy is not provided in the config map. Request must provide trust policy in the CR

What you expected to happen:

I expected a default/generic trust policy to be used. Even though the IRSA-enabled flag was disabled (because we manage our own OIDC settings for the cluster), we did pass in the OIDC URL; however, I think this code bypasses that. If iam-manager cannot construct its own OIDC setup on startup, it bails out?

It seems like if we supply the OIDC URL - or if one exists and the code can discover it - then we should just use that. Right?

Anything else we need to know?:

Environment:

  • iam-manager version: 0.0.6
  • Kubernetes version : 1.18.8

iam-manager installation with SSL cert management

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request

What happened:
iam-manager uses a validating webhook to validate requests before they are persisted to etcd, and the API server can talk to the webhook only over HTTPS, which requires SSL cert management. Kubebuilder assumes that https://github.com/jetstack/cert-manager is installed on the target cluster. Document the iam-manager installation both with and without cert-manager on the target cluster.

What you expected to happen:
Documentation should have clear instructions for users who don't want to use cert-manager on their clusters and prefer to manage certificates manually.

Anything else we need to know?:
[ ] Run the installation with cert-manager
[ ] Run the installation without cert-manager

Handling Defer, Panic, and Recover for Controller Events

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
If there are issues with the master role, or a wrong spec is used to create an Iamrole object (for example, bad indentation or a missing PolicyDocument/Statement in the spec), the controller pod is unable to handle the exception and transitions to CrashLoopBackOff.

What you expected to happen:
The process should log the event and the pod should remain to handle incoming requests.

How to reproduce it (as minimally and precisely as possible):
Create an Iamrole with a wrong spec. For example:

apiVersion: iammanager.keikoproj.io/v1alpha1
kind: Iamrole
metadata:
  name: iamrole-sample-4
spec:
    Statement:
      -
        Action:
          - "sts:Describe"
          - "kms:ListKeys"
          - "kms:PutKeyPolicy"
          - "kms:ListAliases"

Anything else we need to know?:

Adding a defer/recover function to the controller's Reconcile will mitigate this issue:

func (r *IamroleReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	// Recover from panics (e.g. caused by malformed specs) so the
	// controller pod keeps serving instead of crash-looping.
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	ctx := context.WithValue(context.Background(), requestId, uuid.New())
	log := log.Logger(ctx, "controllers", "iamrole_controller", "Reconcile")
	log.WithValues("iamrole", req.NamespacedName)
	log.Info("Start of the request")

	//Get the IAM specific resource
	var iamRole iammanagerv1alpha1.Iamrole
	if err := r.Get(ctx, req.NamespacedName, &iamRole); err != nil {
		log.Error(err, "unable to get iam resource from api server. ignoring it as the event will be back once its available")
		return ctrl.Result{}, ignoreNotFound(err)
	}
	// ... rest of Reconcile ...
}

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

ServiceAccounts should be annotated with sts-regional-endpoints for HA

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST
What happened:

HA for IAM: EKS recommends not using the global STS endpoint (in us-east-1) and instead using the regional endpoint. ServiceAccounts should be annotated with eks.amazonaws.com/sts-regional-endpoints: "true" in order to reach the regional STS endpoint rather than the global one. Details at
https://github.com/aws/amazon-eks-pod-identity-webhook#aws_sts_regional_endpoints-injection
We need to enable this for all service accounts.

What you expected to happen:
When a ServiceAccount is annotated by iam-manager with a role ARN, we should also add eks.amazonaws.com/sts-regional-endpoints: "true".
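Assuming the standard IRSA annotations, the resulting ServiceAccount might look like this (the name and role ARN below are placeholders):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                        # placeholder name
  namespace: my-namespace             # placeholder namespace
  annotations:
    # Role ARN injected by iam-manager (placeholder account/role)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/k8s-my-app
    # Requested addition: route STS calls to the regional endpoint
    eks.amazonaws.com/sts-regional-endpoints: "true"
```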

Allow Default Trust Policy in config map instead of Trust policy ARNs

Is this a BUG REPORT or FEATURE REQUEST?:
Feature

What happened:
At the moment, iam-manager allows administrators to configure a default trust policy as a pre-defined ARN list, so if an Iamrole CR doesn't have an AssumeRolePolicyDocument, it creates the trust policy from the list defined in the config map.

That was fine until we added support for new trust policy types, for example AssumeRoleWithWebIdentity. To support all trust policy types we must accept the entire default trust policy as a config map variable instead of just ARNs.
This is tricky since the trust policy itself is JSON, so we may need to escape quotes to encode/decode it correctly on the server side.
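A sketch of what this could look like, assuming a hypothetical iam.default.trust.policy key (the key name, config map name, and account/provider values below are illustrative, not the shipped schema):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: iam-manager-iamroles-v1alpha1-configmap   # hypothetical name
  namespace: iam-manager-system
data:
  # Hypothetical key: the full default trust policy as a JSON string,
  # used when an Iamrole CR omits AssumeRolePolicyDocument.
  iam.default.trust.policy: |
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Principal": {
            "Federated": "arn:aws:iam::123456789012:oidc-provider/OIDC_PROVIDER_URL"
          }
        }
      ]
    }
```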

What you expected to happen:
Users can provide a trust policy of any type, and it should be used as the default when the request doesn't include one.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Discrepancy between iam.policy.action.prefix.whitelist and AWS permissions boundary

Use Case:
A service to be deployed into an AWS EKS cluster is required to use IRSA and must explicitly define all AWS actions required to be accessed, wildcard actions are not allowed.

What happened:
A mismatch between the prefix whitelist and the Iamrole configuration resulted in the service being unable to access AWS resources, even though the policy was allowed by iam-manager.

The configured prefix list contained the action ec2:Describe (no wildcard), while the generated role included the action ec2:DescribeTags. The iam-manager policy check allowed the action due to the aforementioned prefix. The problem is that the permissions boundary generated in AWS only includes the explicit action ec2:Describe, so while the policy check passes during resource validation, it later fails AWS authorisation because ec2:DescribeTags is not allowed. I did not notice this discrepancy until I ran the AWS policy simulator, which returned:

Implicitly denied by a Permissions Boundary Policy (no matching statements)

What you expected to happen:
In the example provided for allowed_policies.txt there are actions ending with wildcards, i.e. s3:*. This naively gives the impression that, while the documentation says prefix, it is executing a wildcard match during policy checks.

This discrepancy occurs because there are two sources of truth: the allowed_policies.txt used for the permissions boundary (wildcards allowed) and the config map (no wildcards). Would it not be better to validate actions the same way AWS does, using wildcard globbing, to avoid this confusion?
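A minimal sketch of the difference between the two checks (the helper names are hypothetical, not iam-manager's actual functions):

```go
package main

import (
	"fmt"
	"strings"
)

// allowedByPrefix mimics a plain prefix check, one reading of how the
// iam.policy.action.prefix.whitelist entries are evaluated.
func allowedByPrefix(action, entry string) bool {
	return strings.HasPrefix(action, entry)
}

// allowedByWildcard mimics AWS-style matching: an entry without a
// trailing '*' must match the action exactly.
func allowedByWildcard(action, entry string) bool {
	if strings.HasSuffix(entry, "*") {
		return strings.HasPrefix(action, strings.TrimSuffix(entry, "*"))
	}
	return action == entry
}

func main() {
	// "ec2:Describe" (no wildcard) prefix-matches "ec2:DescribeTags"...
	fmt.Println(allowedByPrefix("ec2:DescribeTags", "ec2:Describe")) // true
	// ...but AWS-style matching rejects it, matching the boundary behaviour.
	fmt.Println(allowedByWildcard("ec2:DescribeTags", "ec2:Describe"))  // false
	fmt.Println(allowedByWildcard("ec2:DescribeTags", "ec2:Describe*")) // true
}
```

Validating with the wildcard form would keep the controller's check and the generated permissions boundary in agreement.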

Environment:

iam-manager version: 0.0.7
Kubernetes version : 1.18.9

Policy Spec

---
apiVersion: iammanager.keikoproj.io/v1alpha1
kind: Iamrole
metadata:
  namespace: services
  name: jdv6e5fpi3m7mxpt
  annotations:
    iam.amazonaws.com/irsa-service-account: jdv6e5fpi3m7mxpt
spec:
  PolicyDocument:
    Statement:
      - Effect: "Allow"
        Action:
          - "ec2:DescribeTags"
        Resource:
          - "*"
        Sid: "ReadEc2Tags"

Clean up the code

Is this a BUG REPORT or FEATURE REQUEST?:

What happened:
POC code has been pushed and some of it needs cleanup, including log statements and comments.

What you expected to happen:
Code should be cleaner and generic and not specific to any environment.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Managed policy is read incorrectly from config map

Is this a BUG REPORT or FEATURE REQUEST?:
Bug

What happened:
For my use case, I don't want to provide any managed policy, so I removed the iam.managed.policies field from the iam-manager config map. But I still get the following error:

2020-10-27T22:05:46.575Z ERROR awsapi.iam.CreateRole Error while attaching managed policy {"request_id": "f31f3be2-787c-4a2a-8eba-36e0d3fbef3d", "roleName": "k8s-chaos-ns", "policy": "arn:aws:iam::233444812205:policy/", "error": "InvalidInput: ARN arn:aws:iam::233444812205:policy/ is not valid.\n\tstatus code: 400, request id: 32448561-3ec2-4a78-bb5b-9000cb4ed514"}

This is standard Go behaviour: even when the managed policies field is an empty string ("") in the config map, strings.Split returns a slice of length 1 whose first element is the empty string "".

managedPolicies := strings.Split(cm[0].Data[propertyManagedPolicies], separator)

https://play.golang.org/p/qazwf1dYDPY
What you expected to happen:
IAM role should create successfully

How to reproduce it (as minimally and precisely as possible):
Remove managed policies field from config map.

Hard-coded account_id might be an issue in multi-account environments

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request/Question

What happened:
Is a hard-coded account_id really needed? It changes in a multi-org setting with clusters across accounts. The user has to change the parameter for every cluster, which doesn't work when a single global pipeline deploys to multiple clusters.

What you expected to happen:
The STS GetCallerIdentity call can derive the account ID from the master IAM role, so there is no need to set account_id explicitly.

How to reproduce it (as minimally and precisely as possible):
Deploy application in a different account than the one mentioned in the configmap. Role creation fails.

Anything else we need to know?:

properties.go:

//LoadProperties function loads properties from various sources
func LoadProperties(ctx context.Context, kClient *k8s.Client, ns string, cmName string) (*Properties, error) {
	log := log.Logger(ctx, "config", "LoadProperties")
	log.WithValues("namespace", ns)
	log.Info("loading properties")
	// Derive the AWS account id from the caller identity instead of
	// requiring a hard-coded config map value.
	stsClient := sts.New(session.New())
	callerIdentity, err := stsClient.GetCallerIdentity(&sts.GetCallerIdentityInput{})
	if err != nil {
		log.Error(err, "unable to get caller identity")
		return nil, err
	}
	return &Properties{
		AWSAccountId: *callerIdentity.Account,
	}, nil
}

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Migrate CI Job to GitHub Actions from Travis

Is this a BUG REPORT or FEATURE REQUEST?:
Feature request

What happened:
Travis is shutting down travis-ci.org, so this is probably the best time to move to GitHub Actions.

What you expected to happen:
GitHub Actions should take care of the build and push.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Allow Managed Policy to attach to Individual Role

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request

What happened:
iam-manager allows attaching managed IAM policies to all roles, but not to an individual role. iam-manager should support attaching managed policies to individual roles as well.

What you expected to happen:
A managed policy can be attached to a specific role.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Make webhook validation an optional validation

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
At the moment, webhook validation is mandatory in iam-manager, and the expectation is that customers use cert-manager to handle certificates, making the implementation straightforward. But admission control validation comes with its own requirements, which limits iam-manager's usage.

What you expected to happen:
We should provide an option to install with or without the webhook, so administrators can choose how they want to proceed.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
Just to re-iterate:

With webhook:

  1. We can reject a custom resource that uses a policy outside the whitelisted policies before it is persisted to etcd.
  2. This is the cleaner approach.

Without webhook:

  1. The controller validates the custom resource after the fact and puts it into an Invalid state.

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Update a role with Permission Boundary for existing roles

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request
What happened:
If a role was previously created externally and a user wants to bring it into iam-manager's scope, we need to allow adding/updating the role with a permissions boundary.

What you expected to happen:
If the role already exists, just attach the permissions boundary; otherwise follow the "new role" logic.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

verify iam-manager behavior on cert-renewal by cert-manager

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request

What happened:
If using cert-manager, verify how the controller reacts when cert renewal happens.

What you expected to happen:
iam-manager controller should not have any issues for in-flight requests if cert-manager renew the expired cert.

How to reproduce it (as minimally and precisely as possible):
Run a perf test at 3 TPS, force cert-manager to renew the cert, and verify that no requests fail (or document any issues).

Feature Request: Customizable IAM Role Name Patterns

Is this a BUG REPORT or FEATURE REQUEST?:

This is a feature request. We plan to implement this ourselves in a PR and will share it back.

What happened:

The current naming pattern of k8s-<resource-name> or k8s-<ns-name> is extremely limited, and causes conflicts if you run multiple clusters. We should be able to customize not only the prefix, but the entire name pattern.

What you expected to happen:

Ideally we can set a property like iam.role.prefix to replace k8s with whatever we want. Additionally, iam.role.pattern should be created so that we can replace the entire role name, and use golang-based templating. Eg: iam.role.pattern: {{ object.metadata.namespace}}-{{object.metadata.name}}-{{object.labels.foo }}
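A minimal sketch of what such pattern-based naming could look like with Go's text/template (the pattern value, field names, and helper below are all hypothetical, not the proposed implementation):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Hypothetical iam.role.pattern value; field names are illustrative only.
const pattern = "{{ .Namespace }}-{{ .Name }}"

// roleName renders the configured pattern against the Iamrole's metadata.
func roleName(namespace, name string) (string, error) {
	tmpl, err := template.New("role").Parse(pattern)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := struct{ Namespace, Name string }{namespace, name}
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	n, err := roleName("services", "jdv6e5fpi3m7mxpt")
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // services-jdv6e5fpi3m7mxpt
}
```

A real implementation would also need to validate the rendered name against IAM's role-name length and character limits.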

How to reproduce it (as minimally and precisely as possible):

N/A

Anything else we need to know?:

I am planning on working on this code change. I am wondering how active this project is? Will the PR be reviewed quickly and are you actively accepting code changes?

Handle Delete role gracefully

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

What happened:
As part of deleting a role, iam-manager looks for specific policies to detach before it deletes the role, but it fails if any other policy is attached.

What you expected to happen:
We should change this to

  1. List all the policies attached for a particular role
  2. Detach All policies
  3. Delete role

How to reproduce it (as minimally and precisely as possible):
Try to attach a policy outside of iam-manager and delete the role

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs

Make it "truly" desired state for custom resources

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

What happened:
IAM roles created by iam-manager can be manipulated directly through the AWS IAM APIs by a user with admin access to the cluster account. Though the reconcile process is implemented idempotently, the role can still be out of sync for some duration, since we have no control over external changes.

What you expected to happen:
To make it "truly" desired state, it would be great to get notifications from AWS when any role created by iam-manager (other than the iam-manager role itself) changes. We could use a Lambda to monitor roles tagged by iam-manager and push events to an SQS queue; the iam-manager controller would then consume that queue and apply/overwrite the changes with whatever the user defined in the custom resource.

How to reproduce it (as minimally and precisely as possible):
Create a role in iam-manager and log into AWS account with administrator access and modify the IAM policy.

Anything else we need to know?:

Environment:

  • iam-manager version
  • Kubernetes version :
$ kubectl version -o yaml

Other debugging information (if applicable):

- controller logs:

$ kubectl logs
