Comments (12)

msdisme commented on August 12, 2024

Is there an issue tracking the details of what we would like to capture? I'll want something similar for Decorus, so that I can use it as a basis for discussions with university legal and the IRB.

from blueprint.

tumido commented on August 12, 2024

Data we want to collect and share:

  1. Application logs from all applications running in the cluster. These can contain anything: if an application logs, for example, which users are connecting to it, we will collect that. Example (ODH JupyterHub):
[I 2021-05-05 10:02:42.206 JupyterHub pages:402] [email protected] is pending spawn
[I 2021-05-05 10:02:42.210 JupyterHub log:189] 200 GET /hub/spawn-pending/[email protected] ([email protected]@::ffff:10.131.0.1) 13.28ms
10:02:47.190 [ConfigProxy] info: 200 GET /api/routes
http://10.131.2.139:9090!=http://10.131.2.139:8080
[I 2021-05-05 10:03:00.462 JupyterHub proxy:282] Adding user [email protected] to proxy /user/[email protected]/ => http://10.131.3.105:8080
10:03:00.465 [ConfigProxy] info: Adding route /user/[email protected] -> http://10.131.3.105:8080
10:03:00.465 [ConfigProxy] info: Route added /user/[email protected] -> http://10.131.3.105:8080
10:03:00.465 [ConfigProxy] info: 201 POST /api/routes/user/[email protected]
[I 2021-05-05 10:03:00.468 JupyterHub log:189] 200 GET /hub/api (@10.131.3.105) 1.97ms
[I 2021-05-05 10:03:00.469 JupyterHub users:671] Server [email protected] is ready
[I 2021-05-05 10:03:00.471 JupyterHub log:189] 200 GET /hub/api/users/[email protected]/server/progress ([email protected]@::ffff:10.131.0.1) 18057.13ms
[I 2021-05-05 10:03:00.528 JupyterHub log:189] 200 POST /hub/api/users/[email protected]/activity ([email protected]@10.131.3.105) 33.95ms
[I 2021-05-05 10:03:00.613 JupyterHub log:189] 302 GET /hub/spawn-pending/[email protected] -> /user/[email protected]/ ([email protected]@::ffff:10.131.0.1) 6.94ms
[I 2021-05-05 10:03:01.023 JupyterHub log:189] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-tcoufal%2540redhat.com&redirect_uri=%2Fuser%2Ftcoufal%40redhat.com%2Foauth_callback&response_type=code&state=[secret] -> /user/[email protected]/oauth_callback?code=[secret]&state=[secret] ([email protected]@::ffff:10.131.0.1) 34.50ms
[I 2021-05-05 10:03:01.215 JupyterHub log:189] 200 POST /hub/api/oauth2/token ([email protected]@10.131.3.105) 53.71ms
[I 2021-05-05 10:03:01.246 JupyterHub log:189] 200 GET /hub/api/authorizations/token/[secret] ([email protected]@10.131.3.105) 24.52ms
10:03:02.662 [ConfigProxy] info: 200 GET /api/routes
  1. Application metrics, if the application exposes them. Each application defines which metrics it exposes. These may include PII, for example if a username is used to name a pod (labels can contain anything). Example (ODH JupyterHub):
# HELP jupyterhub_server_spawn_duration_seconds time taken for server spawning operation
# TYPE jupyterhub_server_spawn_duration_seconds histogram
jupyterhub_server_spawn_duration_seconds_bucket{le="0.5",status="success"} 0.0
jupyterhub_server_spawn_duration_seconds_bucket{le="1.0",status="success"} 0.0
jupyterhub_server_spawn_duration_seconds_bucket{le="2.5",status="success"} 0.0
jupyterhub_server_spawn_duration_seconds_bucket{le="5.0",status="success"} 0.0
jupyterhub_server_spawn_duration_seconds_bucket{le="10.0",status="success"} 1.0
jupyterhub_server_spawn_duration_seconds_bucket{le="15.0",status="success"} 8.0
jupyterhub_server_spawn_duration_seconds_bucket{le="30.0",status="success"} 27.0
jupyterhub_server_spawn_duration_seconds_bucket{le="60.0",status="success"} 42.0
jupyterhub_server_spawn_duration_seconds_bucket{le="120.0",status="success"} 52.0
jupyterhub_server_spawn_duration_seconds_bucket{le="+Inf",status="success"} 57.0
jupyterhub_server_spawn_duration_seconds_count{status="success"} 57.0
jupyterhub_server_spawn_duration_seconds_sum{status="success"} 3389.5434402088904
  1. Platform events - events generated by the OCP platform itself. Example (spawning a pod):
{"apiVersion":"v1","count":1,"eventTime":null,"firstTimestamp":"2021-05-05T10:07:31Z","involvedObject":{"apiVersion":"v1","kind":"Pod","name":"jupyterhub-nb-tcoufal-40redhat-2ecom","namespace":"opf-jupyterhub","resourceVersion":"209536743","uid":"7a27741f-a72d-4f0d-bf17-3cd0d3ede494"},"kind":"Event","lastTimestamp":"2021-05-05T10:07:31Z","message":"Add eth0 [10.131.3.106/23]","metadata":{"creationTimestamp":"2021-05-05T10:07:31Z","managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:count":{},"f:firstTimestamp":{},"f:involvedObject":{"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}},"f:lastTimestamp":{},"f:message":{},"f:reason":{},"f:source":{"f:component":{}},"f:type":{}},"manager":"multus","operation":"Update","time":"2021-05-05T10:07:31Z"}],"name":"jupyterhub-nb-tcoufal-40redhat-2ecom.167c23baee4d0e1c","namespace":"opf-jupyterhub","resourceVersion":"209537358","selfLink":"/api/v1/namespaces/opf-jupyterhub/events/jupyterhub-nb-tcoufal-40redhat-2ecom.167c23baee4d0e1c","uid":"40c934ef-4bd2-4f41-8686-e5c979adec62"},"reason":"AddedInterface","reportingComponent":"","reportingInstance":"","source":{"component":"multus"},"type":"Normal"}
{"apiVersion":"v1","count":1,"eventTime":null,"firstTimestamp":"2021-05-05T10:07:32Z","involvedObject":{"apiVersion":"v1","fieldPath":"spec.containers{notebook}","kind":"Pod","name":"jupyterhub-nb-tcoufal-40redhat-2ecom","namespace":"opf-jupyterhub","resourceVersion":"209536741","uid":"7a27741f-a72d-4f0d-bf17-3cd0d3ede494"},"kind":"Event","lastTimestamp":"2021-05-05T10:07:32Z","message":"Container image \"quay.io/thoth-station/s2i-minimal-notebook@sha256:eacfa74842ce6330991d945408bb37c3e8f37246ff3f1b98837cf7ae4f5a78af\" already present on machine","metadata":{"creationTimestamp":"2021-05-05T10:07:32Z","managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:count":{},"f:firstTimestamp":{},"f:involvedObject":{"f:apiVersion":{},"f:fieldPath":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}},"f:lastTimestamp":{},"f:message":{},"f:reason":{},"f:source":{"f:component":{},"f:host":{}},"f:type":{}},"manager":"kubelet","operation":"Update","time":"2021-05-05T10:07:32Z"}],"name":"jupyterhub-nb-tcoufal-40redhat-2ecom.167c23bb0d9cb74e","namespace":"opf-jupyterhub","resourceVersion":"209537393","selfLink":"/api/v1/namespaces/opf-jupyterhub/events/jupyterhub-nb-tcoufal-40redhat-2ecom.167c23bb0d9cb74e","uid":"4677bec6-ee3a-4866-9d0f-b3c3e06f86f6"},"reason":"Pulled","reportingComponent":"","reportingInstance":"","source":{"component":"kubelet","host":"os-wrk-1"},"type":"Normal"}
  1. Platform logs are similar to the application logs, but generated by the OCP platform itself. Example (OAuth logs):
I0427 19:19:02.124608       1 named_certificates.go:53] loaded SNI cert [1/"sni-serving-cert::/var/config/system/secrets/v4-0-config-system-router-certs/apps.zero.massopen.cloud::/var/config/system/secrets/v4-0-config-system-router-certs/apps.zero.massopen.cloud"]: "api.zero.massopen.cloud" [serving,client] validServingFor=[*.apps.zero.massopen.cloud,api.zero.massopen.cloud] issuer="R3" (2021-03-08 12:41:20 +0000 UTC to 2021-06-06 12:41:20 +0000 UTC (now=2021-04-27 19:19:02.124599505 +0000 UTC))
I0427 19:19:02.124830       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1619551141" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1619551141" (2021-04-27 18:19:00 +0000 UTC to 2022-04-27 18:19:00 +0000 UTC (now=2021-04-27 19:19:02.124819977 +0000 UTC))
E0427 19:21:28.160157       1 osinserver.go:91] internal error: system:serviceaccount:opf-jupyterhub:jupyterhub-hub has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0427 19:21:28.160157       1 osinserver.go:91] internal error: system:serviceaccount:openshift-logging:kibana has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0427 19:21:40.496897       1 osinserver.go:91] internal error: system:serviceaccount:openshift-logging:kibana has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0427 19:21:40.496905       1 osinserver.go:91] internal error: system:serviceaccount:opf-jupyterhub:jupyterhub-hub has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0427 19:21:46.638010       1 osinserver.go:91] internal error: system:serviceaccount:opf-jupyterhub:jupyterhub-hub has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0428 14:28:36.180088       1 osinserver.go:91] internal error: system:serviceaccount:opf-monitoring:grafana-serviceaccount has no redirectURIs; set serviceaccounts.openshift.io/oauth-redirecturi.<some-value>=<redirect> or create a dynamic URI using serviceaccounts.openshift.io/oauth-redirectreference.<some-value>=<reference>
E0503 19:37:02.866939       1 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, context canceled]
  1. Platform metrics: same data structure as the application metrics, but generated by OCP itself. Sample of the kube_pod_info metric:

kube_pod_info{container="kube-rbac-proxy-main", created_by_kind="<none>", created_by_name="<none>", endpoint="https-main", host_ip="192.12.185.110", job="kube-state-metrics", namespace="openshift-etcd", node="os-ctrl-0", pod="revision-pruner-5-os-ctrl-0", pod_ip="10.130.0.7", priority_class="system-node-critical", service="kube-state-metrics", uid="c5a1bc73-f28b-4e0f-9cc6-c2c7abd5b0b8"} 1
kube_pod_info{container="kube-rbac-proxy-main", created_by_kind="<none>", created_by_name="<none>", endpoint="https-main", host_ip="192.12.185.110", job="kube-state-metrics", namespace="openshift-kube-apiserver", node="os-ctrl-0", pod="revision-pruner-22-os-ctrl-0", pod_ip="10.130.0.139", priority_class="system-node-critical", service="kube-state-metrics", uid="721d7288-3b7a-4460-be92-bea36e3539fa"} 1
kube_pod_info{container="kube-rbac-proxy-main", created_by_kind="<none>", created_by_name="<none>", endpoint="https-main", host_ip="192.12.185.110", job="kube-state-metrics", namespace="openshift-kube-controller-manager", node="os-ctrl-0", pod="revision-pruner-12-os-ctrl-0", pod_ip="10.130.0.141", priority_class="system-node-critical", service="kube-state-metrics", uid="c92a8983-6ba6-42ed-af6f-535aed848e67"} 1
kube_pod_info{container="kube-rbac-proxy-main", created_by_kind="<none>", created_by_name="<none>", endpoint="https-main", host_ip="192.12.185.110", job="kube-state-metrics", namespace="openshift-kube-scheduler", node="os-ctrl-0", pod="revision-pruner-11-os-ctrl-0", pod_ip="10.130.0.140", priority_class="system-node-critical", service="kube-state-metrics", uid="e799030c-703f-4654-b896-8493f3e2dd35"} 1
kube_pod_info{container="kube-rbac-proxy-main", created_by_kind="<none>", created_by_name="<none>", endpoint="https-main", host_ip="192.12.185.111", job="kube-state-metrics", namespace="openshift-etcd", node="os-ctrl-1", pod="revision-pruner-5-os-ctrl-1", pod_ip="10.128.0.121", priority_class="system-node-critical", service="kube-state-metrics", uid="e8df07cc-9ad9-4479-8922-556b2a1cc2ae"} 1
  1. We also collect data derived from the above, such as alerts, which are calculated directly from metrics, e.g. operate-first/alerts#5609
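
To make the PII concern in the list above concrete, here is a minimal sketch of how one might flag or redact likely identifiers in a log line or metric sample before sharing it. The two regexes and the names `find_pii` and `redact` are assumptions for illustration, not part of any existing pipeline, and the sample string is made up rather than taken from the cluster.

```python
import re

# Hypothetical patterns; real PII detection needs far more than two regexes.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_pii(line: str) -> dict[str, list[str]]:
    """Return likely identifiers found in one log line or metric sample."""
    return {
        "emails": EMAIL_RE.findall(line),
        "ipv4": IPV4_RE.findall(line),
    }


def redact(line: str) -> str:
    """Replace likely identifiers with fixed placeholders."""
    line = EMAIL_RE.sub("<email>", line)
    return IPV4_RE.sub("<ip>", line)


if __name__ == "__main__":
    # Made-up sample in the style of a proxy log line.
    sample = "Adding user jane.doe@example.org to proxy => http://10.0.0.5:8080"
    print(find_pii(sample))
    print(redact(sample))
```

Note that escaped forms, such as a username embedded in a pod name like the one in the platform events example, would not be caught by these patterns, which is part of why "labels can be anything" matters for the legal and IRB review.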

Data we host for users and their applications. We do not collect this data intentionally, but users can share it via our platform:

  • We provide block storage which is used by applications and users to store data. Direct access to this block storage is available within the platform only and data can be retrieved only via proxy (the application mounting the storage itself).
  • We provide object storage that can be interfaced externally - users can access this data from outside of the platform if they have credentials to their object storage bucket.

msdisme commented on August 12, 2024

Thanks, this is great! Should I break the details in the comment above out into a separate issue, or does it make sense for them to live here?

msdisme commented on August 12, 2024

A quick update: I met with the folks who handle IRB review and am scheduling a follow-up discussion with them to dive deeper into the data.

durandom commented on August 12, 2024

Operational data specifically excludes users' own data sets, i.e. it is only data generated by the platform: logs, metrics, telemetry.
For logs, it excludes logs from workload pods but includes logs from platform pods, e.g. JupyterHub vs. etcd.
For metrics, it includes CPU metrics for workload pods but not metrics that the application itself exposes, e.g. JupyterHub metrics vs. pod metrics.

The same definition can be made for workload data, which should be governed by an opt-in or opt-out policy - see #87
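
As a rough sketch, the definition above can be written as a decision rule. The `Kind` and `Origin` enums, the `Record` type, and `is_operational` are hypothetical names introduced here for illustration. The sketch also assumes that pod-level CPU metrics count as platform-generated even when they describe a workload pod, which is one reading of the definition rather than an agreed policy.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    LOG = "log"
    METRIC = "metric"
    USER_DATA = "user_data"  # users' own data sets


class Origin(Enum):
    # Who generated the data, not which pod it describes (assumption).
    PLATFORM = "platform"  # e.g. etcd logs, pod CPU metrics
    WORKLOAD = "workload"  # e.g. JupyterHub logs, metrics JupyterHub exposes


@dataclass(frozen=True)
class Record:
    kind: Kind
    origin: Origin


def is_operational(record: Record) -> bool:
    """True if the record counts as operational data under this reading."""
    if record.kind is Kind.USER_DATA:
        return False
    return record.origin is Origin.PLATFORM


if __name__ == "__main__":
    print(is_operational(Record(Kind.LOG, Origin.PLATFORM)))     # etcd logs
    print(is_operational(Record(Kind.METRIC, Origin.WORKLOAD)))  # JupyterHub metrics
```

Anything for which `is_operational` returns `False` would fall under the workload data policy referenced above.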

billburnseh commented on August 12, 2024

No updates from BU yet.

billburnseh commented on August 12, 2024

The Data Usage Agreement (DUA) is on the table and being discussed, including access to telemetry without anonymization.

quaid commented on August 12, 2024

We want to publish the data under a license agreement that is similar to an open source license agreement.
We still have to operate within the boundaries of the law and therefore cannot publish data that would break it.

Let's pull together a workstream to study and advise on an approach from an open source licensing perspective:

operate-first/community#79

sesheta commented on August 12, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta commented on August 12, 2024

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

sesheta commented on August 12, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

sesheta commented on August 12, 2024

@sesheta: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
