Comments (17)

mrbobbytables commented on July 28, 2024

/committee steering
as this has budget / cncf related items

from registry.k8s.io.

jeefy commented on July 28, 2024

Jeefy lizard brain policy idea:

Non-Prod images: Age out after 9 months
Prod images: Age out after release-EOL + 1y (so if 1.27 went EOL in Jan 2024, it would get culled in Jan 2025)
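
The proposed policy boils down to a small date computation; a minimal sketch (the function name and the month arithmetic are illustrative, not an existing tool):

```python
from datetime import date
from typing import Optional

def cull_date(published: date, eol: Optional[date]) -> date:
    """Hypothetical cull date under the proposed policy:
    prod images (tied to a release EOL) age out one year after EOL;
    non-prod images age out nine months after publication."""
    if eol is not None:
        # prod: EOL + 1 year (Feb 29 edge cases ignored in this sketch)
        return eol.replace(year=eol.year + 1)
    # non-prod: publication + 9 months (day-overflow ignored in this sketch)
    years, months = divmod(published.month - 1 + 9, 12)
    return published.replace(year=published.year + years, month=months + 1)

print(cull_date(date(2020, 1, 1), eol=date(2024, 1, 31)))   # 2025-01-31
print(cull_date(date(2023, 6, 15), eol=None))               # 2024-03-15
```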

mrbobbytables commented on July 28, 2024

I am broadly in support of this. I don't think it's a reasonable ask for the project to host all releases for all time.
Do we have any data on which versions are being pulled?
I know there are 3rd-party reports available (e.g. the Datadog report) showing that 1.21 is still the most commonly deployed version right now; I would want to make sure we take that into account.

Right now I'm leaning towards EOL-9 (3 years), but would want some data before making any decision.

EDIT: Even without data I think we should remove the google-containers images...oof

ameukam commented on July 28, 2024

/sig testing
/kind cleanup

BenTheElder commented on July 28, 2024

Some additional points:

  • We have a lot of images that are not related to Kubernetes releases, but may have their own release timelines. I think we should maybe just set an N year policy, where N is relatively generous but still gives us room to stop perma-hosting.

  • The mechanism to remove images needs some thinking ...

    • The source of truth for which images are available is the https://github.com/kubernetes/k8s.io repo, and https://github.com/kubernetes-sigs/promo-tools copies them to the backing registries (so, to be clear, no changes will happen in this repo; it's a multi-repo problem and I figured visibility might be best here, but we need to forward this to more people).
    • We might have to cull them from the image promoter manifests somehow, and then start to permit auto-pruning things that are in production storage but not in the manifests. Automating the policy to drop things from the source manifests sounds tricky, but I'm not sure there's a more reasonable approach. The implementation details need input from promo-tools maintainers and will probably at least somewhat drive the viable policy. cc @puerco @kubernetes/release-engineering (EDIT: see kubernetes-sigs/promo-tools#719)
  • IMHO, despite needing to consider the mechanics, we should decide whether we're doing this, settle on a reasonable policy, and start communicating ahead of actually implementing it. It may take time to staff these changes, but communicating sooner, alongside the new registry, would be beneficial to users.
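
The pruning idea in the second bullet amounts to a set difference over image digests; a minimal sketch (names are hypothetical, not promo-tools API):

```python
def prune_candidates(storage_digests: set[str], manifest_digests: set[str]) -> set[str]:
    """Digests present in the backing registry but no longer referenced by
    any promoter manifest would be eligible for auto-pruning."""
    return storage_digests - manifest_digests

stored = {"sha256:aaa", "sha256:bbb", "sha256:old"}
promoted = {"sha256:aaa", "sha256:bbb"}
print(prune_candidates(stored, promoted))  # {'sha256:old'}
```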

EDIT: Even without data I think we should remove the google-containers images...oof

Yeah. That also just hasn't happened due to lack of policy. We did the flip to k8s.gcr.io July 24th 2020. kubernetes/release#270 (comment)

/sig testing

While I'm sure sig-testing is happy to support this effort, I'd suggest that the policy is a combination of k8s-infra (what is k8s-infra willing to fund resources for generally) and release (particularly around the promo tools support for this and kubernetes release timeframe).

/sig k8s-infra
/sig release
/remove-sig testing

sftim commented on July 28, 2024

Technical aside: when we serve layer redirects, we then get an option to serve a Warning header alongside the redirect.
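
As a sketch of what that could look like (an illustrative helper, not registry.k8s.io code), the redirect response could carry an RFC 7234 `Warning` header with warn-code 299 ("miscellaneous persistent warning"):

```python
from typing import Optional

def layer_redirect(location: str, warning: Optional[str] = None):
    """Build a (status, headers) pair for a blob redirect; optionally attach
    an RFC 7234 Warning header (code 299, miscellaneous persistent warning)
    so clients that surface warnings see the notice. Illustrative only."""
    headers = {"Location": location}
    if warning is not None:
        # warn-code SP warn-agent SP warn-text; "-" is a valid warn-agent
        headers["Warning"] = f'299 - "{warning}"'
    return 302, headers

status, headers = layer_redirect(
    "https://example-bucket.example.com/sha256-abc",
    warning="these images are scheduled for removal",
)
print(status, headers["Warning"])
```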

BenTheElder commented on July 28, 2024

One additional complication for people to consider: We more commonly host our own base images at this point, building old commits from source will become more challenging (but not necessarily impossible*) if we age out those images.

* non-reproducible builds may be an issue, e.g. the "debian-base" image.

justinsb commented on July 28, 2024

Posted this on the linked issue, but maybe it's better here:

The cost reduction would primarily be because we would break people, and encourage them to upgrade (?)

I think other OSS projects follow a similar strategy, e.g. the "normal" Debian APT repos don't work with old releases, but there is a public archive.

I don't know the actual strategy for when Debian distros move to the archive. For Kubernetes, if we support 4 versions (?), we probably want to keep at least 5 versions "live" so that people aren't forced to upgrade all their EOL clusters at once, but we probably want to keep no more than 8 versions "live" so that people are nudged to upgrade. So I come up with a range of 5-8 releases if we wanted to split off old releases, and I can imagine a case for any of those values.

xmudrii commented on July 28, 2024

For Kubernetes, if we support 4 versions (?), we probably want to keep at least 5 versions "live" so that people aren't forced to upgrade all their EOL clusters at once, but we probably want to keep no more than 8 versions "live" so that people are nudged to upgrade.

This brings up a very good point. We have to keep in mind that you can't skip Kubernetes minor versions when upgrading; otherwise you'd go against the skew policy and eventually break the cluster. Let's say we remove images for everything before 1.22, but someone is using 1.20. They have no way to upgrade their cluster to a newer version, because they would have to start by upgrading to 1.21, which would no longer exist. This is a very problematic scenario because the only way forward is more-or-less to start from scratch, and that's unacceptable for many.

We need to be a bit generous here. I agree with @justinsb that we should target 5-8 releases. I'd probably go closer to 8.

BenTheElder commented on July 28, 2024

These are good points.

I think we should probably implement the policy in terms of time though, both because that's more manageable to implement and because we have many images that are not part of Kubernetes releases.

If we consider the lifespan of releases but put it in terms of time we could say e.g. "3 years after publication" which would be 5-8 releases (since releases are every 1/3 year and supported for one year).
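
As a back-of-envelope check, a time-based window maps to a count of live minors roughly as follows (the cadence of three minor releases per year and one year of support are assumptions that have varied over the project's history):

```python
def live_minor_releases(extra_years_after_eol: float,
                        releases_per_year: int = 3,
                        support_years: float = 1.0) -> int:
    """At a steady cadence, the number of minors still hosted equals the
    retention window (support plus extra time after EOL) times the cadence."""
    return round((support_years + extra_years_after_eol) * releases_per_year)

print(live_minor_releases(1))  # hosting until EOL + 1y keeps ~6 minors live
print(live_minor_releases(0))  # hosting only supported releases keeps ~3
```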

jeremyrickard commented on July 28, 2024

I think we should set a more aggressive timeline going forward, say starting with 1.27 we'll host artifacts for 1 year after they hit end of support. I'm not sure how we best handle older things, but things like 1.21 are still being heavily used. If we said "5" releases, that would still fall into that window pretty soon.

If we have to choose between the health of the project overall (CI infra, etc.) and impacting people with those older versions, I think we unfortunately have to choose the health of the project :( Can we provide some extraordinary mechanisms for people to pull maybe tarballs of older stuff and some instructions on how to put those into a registry, like some sort of cold storage for folks?

dims commented on July 28, 2024

💯 to "starting with 1.27 we'll host artifacts for 1 year after they hit end of support".

xmudrii commented on July 28, 2024

Can we provide some extraordinary mechanisms for people to pull maybe tarballs of older stuff and some instructions on how to put those into a registry, like some sort of cold storage for folks?

What if we keep the latest patch release for each minor? For example, 1 year after reaching the EOL date, we keep images for the latest patch release, but delete all other images. Eventually, we can remove the latest patch release, let's say, 3-4 years after the EOL date. That will reduce storage and (hopefully) bandwidth costs, but at the same time, it shouldn't break any clusters or workflows.
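
The "latest patch per minor" selection is straightforward to express; a sketch over hypothetical tags (real registry tags vary in format):

```python
def latest_patch_per_minor(tags: list[str]) -> set[str]:
    """From a list of vMAJOR.MINOR.PATCH tags, keep only the highest
    patch release of each minor version. Illustrative only."""
    best: dict = {}  # (major, minor) -> (patch, tag)
    for tag in tags:
        major, minor, patch = (int(x) for x in tag.lstrip("v").split("."))
        key = (major, minor)
        if key not in best or patch > best[key][0]:
            best[key] = (patch, tag)
    return {tag for _, tag in best.values()}

tags = ["v1.20.0", "v1.20.15", "v1.21.3", "v1.21.14"]
print(latest_patch_per_minor(tags))  # {'v1.20.15', 'v1.21.14'}
```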

aojea commented on July 28, 2024

The source code will remain at https://github.com/kubernetes/kubernetes/releases; it's not that people who need it can't run make release ;)

jberkus commented on July 28, 2024

Technical capabilities aside, the ideal set would be, IMHO:

  • Everything for the last 4 releases
  • Just the final patch release for the 4 releases before that, just to enable upgrading

If we had the ability, I expect there are probably lots of other images from subprojects, etc. that we could purge for older releases much more aggressively.

BenTheElder commented on July 28, 2024

Thanks all for all the input and suggestions!


The source code will remain at https://github.com/kubernetes/kubernetes/releases; it's not that people who need it can't run make release ;)

That's not sufficient, because the base images typically cannot just be re-built from source, and the source code references those base images. They are a snapshot of various packages at a point in time.

We also run binary builds in containers sometimes (see kubernetes/kubernetes), where the build-environment image is also an important point-in-time snapshot for reproducing a source build.


We've been discussing this further in various forums, and on further review I think the cost angle is going to be nearly irrelevant once we have moved traffic off k8s.gcr.io and onto registry.k8s.io.

Storage costs are not large enough to matter. Image storage de-dupes pretty well, it's compressed, and in real numbers we're actually looking at less than 2 TB currently, even after ~8 years. 2 TB costs about $50/mo to store in GCS standard tier, for example.
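
The arithmetic behind that figure, assuming roughly $0.026 per GB-month for GCS standard storage (the price is an assumption; check current pricing):

```python
# Rough arithmetic behind the "$50/mo" figure for ~2 TB of image storage.
tb_stored = 2
price_per_gb_month = 0.026  # assumed GCS standard-tier price, USD per GB-month
monthly_cost = tb_stored * 1000 * price_per_gb_month
print(f"${monthly_cost:.0f}/mo")  # $52/mo
```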

Bandwidth costs matter, deeply, ... but only for content people actually use, and they're going to be a lot more manageable on registry.k8s.io (e.g. >50% of requests to k8s.gcr.io came from AWS, and we're now serving content out of AWS ... no egress).

For the tooling "needs to iterate all of this" angle: I've been working on a new tool that needs to scan all the images over the past few days, and it's actually quite doable to optimize this even as images grow. Artifact Registry provides a cheap API to list all digests in a repo. I think we can teach the promoter tools to avoid repeatedly fetching content-addressable data we've already processed.
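
Since image data is content-addressed, a digest processed once never changes; a sketch of that caching idea (hypothetical names, not the actual promoter tooling):

```python
class DigestScanner:
    """Sketch of incremental scanning: results are cached by digest, so
    re-listing a repo only fetches digests not yet processed."""

    def __init__(self):
        self.cache: dict = {}  # digest -> processed result

    def scan(self, digests: set, fetch) -> dict:
        for d in digests:
            if d not in self.cache:  # content-addressed: fetch at most once
                self.cache[d] = fetch(d)
        return {d: self.cache[d] for d in digests}

calls = []
def fetch(digest: str) -> str:
    calls.append(digest)
    return "meta:" + digest

scanner = DigestScanner()
scanner.scan({"sha256:a", "sha256:b"}, fetch)
scanner.scan({"sha256:a", "sha256:b", "sha256:c"}, fetch)
print(len(calls))  # 3 fetches total across both scans
```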


I think disappearing base images in particular is going to cause more harm than benefit ...

We're also really ramping up having users migrate, so I think we're missing the window on "let's get a policy in place before people use it", and we've already forced users to adapt to a new registry twice in the past few years, so I think we can just introduce a policy later if we wind up needing it.

@mrbobbytables' outreach to subprojects and other CNCF projects to migrate, and other pending cost-optimization efforts, are probably a better use of time at the moment.

BenTheElder commented on July 28, 2024

There are other issues tracking things like potentially sunsetting k8s.gcr.io, they're tracked in the general k8s-infra repo at https://github.com/kubernetes/k8s.io
