Git Product home page Git Product logo

Comments (7)

jbw976 avatar jbw976 commented on August 14, 2024

it is somewhat unlikely that this scenario will be exercised that frequently

Could you clarify on how the scenario described in this issue differs from the mainline scenarios that our user guides walk people through?

from provider-gcp.

muvaf avatar muvaf commented on August 14, 2024

Could you clarify on how the scenario described in this issue differs from the mainline scenarios that our user guides walk people through?

As long as you allow Connection resource to be created before creating the CloudSQL resource, it's fine. Before, network resources were created separately from the claims, so, there was a short period in-between and we didn't see that problem. But now if you just do kubectl apply -f <dir> and that dir includes both MySQLInstance and Connection resources, you might end up in this situation.

There are two things here:

  • For whatever reason, GCP API doesn't return an error for Insert call in this case but when you go to console, you'd see the error given above and creation failure. This needs some more investigation as there might be some classes of errors that we have to account for specifically.
  • GCP explicitly tells you Please create Service Networking connection with service 'servicenetworking.googleapis.com' from consumer project '283222062215' network 'argo' again. but doing that doesn't help. CloudSQL doesn't reconcile on its own even if you create this like they suggest. You have to delete the CloudSQL instance and create a new one.

For the first problem, we probably need more investigation of CloudSQL API. It might be the case that this is not treated as error but rather a warning. Nevertheless, our CR should show it.

For the second one, what we could do is to have a reference only to block the creation. For instance, CloudSQL would have a connectionRef but this would be used only for waiting the referred object to be created, not really resolve to a value to be used in CloudSQL spec. cc'ing @negz

from provider-gcp.

hasheddan avatar hasheddan commented on August 14, 2024

For the second one, what we could do is to have a reference only to block the creation.

This was my initial thought was well. However, after talking with @negz he expressed some concerns around building a dependency graph a an eventually consistent system. We may have to get creative with how we want to solve this issue, and maybe the first pass does just look like a connectionRef that resolves without setting a configuration value in the CloudSQLInstance. These kind of scenarios make me glad that we now have our forProvider section of the spec because I like the thought of having "depends on" references in the spec but not in forProvider.

But now if you just do kubectl apply -f < dir > and that dir includes both MySQLInstance and Connection resources, you might end up in this situation.

@jbw976 To expound on what @muvaf said here, the creation in the Services guides is sequential so this is not an issue, and in the Stacks guide I am guessing this is not an issue because the CloudSQLInstance is waiting for the Network to resolve with this reference, and it must be the common case that this introduces enough of a delay that the Connection is ready.

from provider-gcp.

negz avatar negz commented on August 14, 2024

it is somewhat unlikely that this scenario will be exercised that frequently

Could you clarify on how the scenario described in this issue differs from the mainline scenarios that our user guides walk people through?

@jbw976 Frankly I'm pretty skeptical that the scenario our guides walk folks through is a use case that exists in the real world. To clarify:

  • I believe using GitOps to manage network infrastructure is a real scenario
  • I believe using GitOps to manage CloudSQL resources is a real scenario
  • I believe the way our guides frame all of this (create everything at once) is a powerful demonstration

...but I strongly suspect folks wanting to create a VPC (typically the base network primitive that all other infrastructure lives in) in the exact same step as the infrastructure that runs in that VPC is a rare case, at best. It's certainly not a pattern I have ever encountered during my time in SRE. More commonly I would imagine creating the network infrastructure (a rare event) and creating infrastructure like databases that live in that network (a much more common event) would be distinct actions and thus either distinct stages in a GitOps pipeline, or distinct pipelines altogether.

I do think it's worth looking for a fix for this issue, but I strongly suggest that we weigh the complexity and effort required to fix it against how potentially serious this shortcoming is to folks trying to use Crossplane in the real world.

from provider-gcp.

negz avatar negz commented on August 14, 2024

So two options come to mind here. Neither would bulletproof this experience, but both would reduce the the likelihood of a user encountering it. If a CloudSQLInstance references a Network, the network isn't "ready" as far as the CloudSQLInstance is concerned unless it has an active service connection. So:

  1. We can infer that a service connection exists by inspecting the Network managed resource's peerings, which are already reflected in its status (see below). So the GetStatus phase of reference resolution could block and only consider the Network ready when its Ready condition is True and it has a peering named servicenetworking-googleapis-com.
  2. We can determine that a Crossplane-managed service connection exists by (again during GetStatus ) listing all Connection resources (at most one exists per Network so I can't imagine there would ever be many) and determining whether any of them reference the Network that the CloudSQLInstance uses.
status:
    peerings:
    - autoCreateRoutes: true
      exchangeSubnetRoutes: true
      name: cloudsql-mysql-googleapis-com
      network: https://www.googleapis.com/compute/v1/projects/speckle-umbrella-30/global/networks/cloud-sql-network-283222062215-b2a39e7ea055b996
      state: ACTIVE
      stateDetails: '[2019-11-06T13:04:01.073-08:00]: Connected.'
    - autoCreateRoutes: true
      exchangeSubnetRoutes: true
      name: servicenetworking-googleapis-com
      network: https://www.googleapis.com/compute/v1/projects/c064626fb0df236dc-tp/global/networks/servicenetworking
      state: ACTIVE
      stateDetails: '[2019-11-06T13:03:51.961-08:00]: Connected.'
    routingConfig:
      routingMode: REGIONAL
    selfLink: https://www.googleapis.com/compute/v1/projects/REDACTED/global/networks/my-cool-network

My inclination is to go with option 1, because it's easier to do, and better supports the case in which you manage your Network with Crossplane, but not your service networking connection.

from provider-gcp.

muvaf avatar muvaf commented on August 14, 2024

the GetStatus phase of reference resolution could block and only consider the Network ready when its Ready condition is True and it has a peering named servicenetworking-googleapis-com.

Option 1 makes sense to me, too. If the name is constant, having it show up on the CR doesn't make a lot of sense. So, I like that option more than the one that adds a reference just to block. Handling the dependency implicitly in the NetworkURIReferencerForCloudSQLInstance sounds right.

Though this is still a bug on GCP side. They say try creating connection and we eventually do but CloudSQL doesn't reconcile and it doesn't return an error on creation. Hope there are not a class of errors that show up only in the console like this one.

from provider-gcp.

negz avatar negz commented on August 14, 2024

Hope there are not a class of errors that show up only in the console like this one.

FWIW I think this does show up on the API too - at least the CloudSQL instance transitions to state FAILED and remains there.

from provider-gcp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.