
vitess-operator's People

Contributors

carsonoid, derekperkins, enisoc, sougou

vitess-operator's Issues

Enable custom flags

It should be possible to enable cascading flags for all of the Vitess binaries. Here is an example of how that looks in the current helm charts:

vttablet:

  # Additional flags that will be appended to the vttablet command.
  # The options below are the most commonly adjusted, but any flag can be put here.
  # run vttablet --help to see all available flags
  extraFlags:
    # query server max result size, maximum number of rows allowed to return
    # from vttablet for non-streaming queries.
    queryserver-config-max-result-size: 10000

    # query server query timeout (in seconds), this is the query timeout in vttablet side.
    # If a query takes more than this timeout, it will be killed.
    queryserver-config-query-timeout: 30

https://github.com/vitessio/vitess/blob/master/helm/vitess/values.yaml#L260-L291

These flags should be settable globally for vttablet, vtgate, etc., with the option to override them at each subsequent level: keyspace, shard, and individual tablet. Each flag should be unique by name, so that a more specific value overrides a more general one.
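A minimal sketch of the override order described above, assuming each level carries a plain name-to-value map of flags. The function names, the example override values, and the `-name=value` rendering are illustrative assumptions, not the operator's actual API.

```python
from typing import Any, Dict, List, Mapping


def merge_flags(*levels: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge flag maps ordered from most general to most specific.

    A flag set at a later (more specific) level replaces the value
    set at an earlier (more general) level.
    """
    merged: Dict[str, Any] = {}
    for level in levels:
        merged.update(level)
    return merged


def to_args(flags: Mapping[str, Any]) -> List[str]:
    """Render a flag map as command-line arguments (assumed -name=value form)."""
    return [f"-{name}={value}" for name, value in sorted(flags.items())]


# Hypothetical values at each level; only the flag names come from the helm example.
global_flags = {
    "queryserver-config-max-result-size": 10000,
    "queryserver-config-query-timeout": 30,
}
keyspace_flags = {"queryserver-config-query-timeout": 60}
shard_flags: Dict[str, Any] = {}
tablet_flags = {"queryserver-config-max-result-size": 5000}

effective = merge_flags(global_flags, keyspace_flags, shard_flags, tablet_flags)
print(to_args(effective))
```

Because the map is keyed by flag name, a flag can appear only once in the result, which is the uniqueness property the issue asks for.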

Add new updateStrategy for "MasterLast"

Using RollingUpdate on a StatefulSet almost guarantees that there will be at least 2 PlannedReparentShard commands run, and in the worst case scenario where PlannedReparentShard keeps selecting n-1, you could have N reparents for N replicas. It would be much preferable to apply updates to all the slaves first, and only then update the master, guaranteeing a maximum of one reparent.
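The difference between the two orderings can be sketched with a small simulation. It assumes that a pod holding the master role must be reparented away before it is updated, and it models the worst case as a chooser that always picks the tablet due to be updated next. All names here are hypothetical; this is not how the operator or PlannedReparentShard is implemented.

```python
from typing import Callable, List


def master_last_order(aliases: List[str], master: str) -> List[str]:
    """Return the update order with every non-master first and the master last."""
    return [a for a in aliases if a != master] + [master]


def worst_case_choice(order: List[str], index: int) -> str:
    """Pick the tablet that will be updated next (wrapping around at the end)."""
    return order[(index + 1) % len(order)]


def count_reparents(
    order: List[str],
    master: str,
    choose_new_master: Callable[[List[str], int], str],
) -> int:
    """Count reparents needed when pods are updated in the given order."""
    reparents = 0
    for index, alias in enumerate(order):
        if alias == master:
            # The pod about to be updated is the master: move the role first.
            master = choose_new_master(order, index)
            reparents += 1
    return reparents


tablets = ["replica-0", "replica-1", "replica-2"]
current_master = "replica-2"

# RollingUpdate on a StatefulSet proceeds from the highest ordinal down.
rolling = list(reversed(tablets))
print(count_reparents(rolling, current_master, worst_case_choice))

# MasterLast: update everything else first, then the master.
print(count_reparents(master_last_order(tablets, current_master),
                      current_master, worst_case_choice))
```

Under these assumptions the rolling order needs one reparent per pod, while the master-last order needs exactly one.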

deploying a cluster fails

I am unable to deploy a vitess cluster on my bare metal k8s 14.1 cluster.

kubectl apply -R -f deploy
kubectl apply -f my-vitess.yaml

kubectl get pods -w
NAME                                    READY   STATUS                  RESTARTS   AGE
vitess-operator-594f878667-gzs5w        1/1     Running                 0          16m
vt-zone1-unsharded-dbname-0-replica-0   0/6     Init:CrashLoopBackOff   7          15m
vt-zone1-unsharded-dbname-0-replica-1   0/6     Init:CrashLoopBackOff   7          15m
vt-zone1-vtctld-5c8cb6f797-sh4vz        0/1     CrashLoopBackOff        7          15m
vt-zone1-vtgate-958f4b894-8lgbb         0/1     CrashLoopBackOff        7          15m
vt-zone1-vtgate-958f4b894-wjkh4         0/1     CrashLoopBackOff        7          15m
wordpress-operator-664b87b9c9-9j8r6     1/1     Running                 0          43m

kubectl log vitess-operator-594f878667-gzs5w shows hundreds of lines like the following:
E0429 09:40:53.006233 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:196: Failed to list *v1.Job: jobs.batch is forbidden: User "system:serviceaccount:default:vitess-operator" cannot list resource "jobs" in API group "batch" in the namespace "default"

kubectl get jobs
No resources found.

kubectl get vitessclusters -o 'custom-columns=NAME:.metadata.name,READY:.status.phase'
NAME READY
vt

This is my first time trying to deploy Vitess so I am unsure what to try next.
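The "forbidden" line in the operator log names the verb, resource, API group, and namespace that the service account lacks. As a rough sketch, those fields can be extracted and turned into the shape of an RBAC rule. The regular expression and function name are assumptions for illustration; whether the rule belongs in a Role or ClusterRole, and which other verbs the operator needs, is not something the log shows.

```python
import json
import re
from typing import Dict, List, Optional

# Pattern for the "forbidden" message format seen in the log line above.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def missing_rule(log_line: str) -> Optional[Dict[str, List[str]]]:
    """Build an RBAC rule dict covering the access denied in log_line."""
    match = FORBIDDEN.search(log_line)
    if match is None:
        return None
    return {
        "apiGroups": [match.group("group")],
        "resources": [match.group("resource")],
        "verbs": [match.group("verb")],
    }


line = (
    'Failed to list *v1.Job: jobs.batch is forbidden: User '
    '"system:serviceaccount:default:vitess-operator" cannot list resource '
    '"jobs" in API group "batch" in the namespace "default"'
)
rule = missing_rule(line)
print(json.dumps(rule, indent=2))
```

The printed rule only covers the single denied verb; an operator that creates and watches Jobs would likely need more verbs than `list`.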

status

What is the current production readiness status of the operator? Is it OK to use? Are upgrades supported? Is the API stable?

Try to keep master tablets on separate nodes

In most deployments, master tablets will see much higher CPU and network usage, and somewhat higher RAM usage. This can't be handled with standard affinity/anti-affinity, since reparenting happens without any impact on k8s labels. I would love for the operator to monitor how many master tablets live on each node and try to keep things balanced.

The least intrusive way would be to just reparent tablets when certain nodes have too many, though this could cause a cascading effect.

More intrusive would be to actually reschedule pods to different nodes. This would give you more control, but would involve more disk attaching/detaching, and in the case of local ssd mounts, full restores of data.
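The monitoring half of this idea can be sketched as a count of master tablets per node, flagging nodes above a threshold as candidates for a reparent. The tablet tuples, node names, and the threshold are made-up inputs; how the operator would actually learn tablet types and node placement is left open.

```python
from collections import Counter
from typing import List, Tuple

# (tablet alias, node name, tablet type); hypothetical data for illustration.
Tablet = Tuple[str, str, str]


def masters_per_node(tablets: List[Tablet]) -> Counter:
    """Count how many master tablets currently live on each node."""
    return Counter(node for _, node, ttype in tablets if ttype == "master")


def overloaded_nodes(tablets: List[Tablet], max_masters_per_node: int = 1) -> List[str]:
    """Return nodes hosting more masters than the allowed maximum."""
    counts = masters_per_node(tablets)
    return sorted(node for node, n in counts.items() if n > max_masters_per_node)


tablets = [
    ("zone1-100", "node-a", "master"),
    ("zone1-101", "node-b", "replica"),
    ("zone1-200", "node-a", "master"),
    ("zone1-201", "node-c", "replica"),
    ("zone1-300", "node-b", "master"),
]

print(masters_per_node(tablets))
print(overloaded_nodes(tablets))
```

A reparent-based fix would then move one master off each flagged node to a replica on a less loaded node, which is where the cascading effect mentioned above would need guarding against.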

Endless loop at “init-replica-master” job

I use the following VitessCluster custom resource, in which I set "spec.keyspaces[].spec.shards[].spec.defaults.replicas" to 5 for one shard.

## Sample VitessCluster all-in-one resource
apiVersion: vitess.io/v1alpha2
kind: VitessCluster
metadata:
  name: aio
  labels:
    app: vitess
spec:
  lockserver:
    metadata:
      name: global
    spec:
      type: etcd2
      etcd2:
        address: etcd-global-client:2379
        pathPrefix: /vitess/global
  cells:
  - metadata:
      name: zone1
    spec:
      lockserver:
        metadata:
          name: zone1
        spec:
          type: etcd2
          etcd2:
            address: etcd-zone1-client:2379
            pathPrefix: /vitess/zone1
      defaults:
        replicas: 1
        image: vitess/vttablet:helm-1.0.6
  keyspaces:
  - metadata:
      name: sharded-dbname
    spec:
      shards:
      - metadata:
          name: "x-80"
        spec:
          keyRange: { to: "80" }
          defaults:
            replicas: 2
            containers:
              mysql:
                image: percona:5.7.23
              vttablet:
                image: vitess/vttablet:helm-1.0.6
          tablets:
          - metadata:
              name: zone1
            spec:
              cellID: zone1
              tabletID: 102
              type: replica
      - metadata:
          name: "80-x"
        spec:
          keyRange: { from: "80" }
          defaults:
            replicas: 5
            containers:
              mysql:
                image: percona:5.7.23
              vttablet:
                image: vitess/vttablet:helm-1.0.6
          tablets:
          - metadata:
              name: zone1
            spec:
              cellID: zone1
              tabletID: 103
              type: replica

But the "aio-zone1-sharded-dbname-80-x-init-replica-master" job never completes. The job's logs show the following:

++ echo 'zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ awk '{print $1}'
+ tabletCount=5
+ '[' 5 == 2 ']'
+ ((  115 > 600  ))
+ sleep 5
+ '[' ']'
++ vtctlclient -server aio-zone1-vtctld.default:15999 ListAllTablets zone1
+ cellTablets='zone1-0638444000 sharded-dbname -80 master aio-zone1-sharded-dbname-x-80-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-x-80-replica-0.aio-tab:3306 []
zone1-0638444001 sharded-dbname -80 replica aio-zone1-sharded-dbname-x-80-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-x-80-replica-1.aio-tab:3306 []
zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ echo 'zone1-0638444000 sharded-dbname -80 master aio-zone1-sharded-dbname-x-80-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-x-80-replica-0.aio-tab:3306 []
zone1-0638444001 sharded-dbname -80 replica aio-zone1-sharded-dbname-x-80-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-x-80-replica-1.aio-tab:3306 []
zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ grep -w aio-zone1-sharded-dbname-80-x
+ shardTablets='zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ echo 'zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ awk '$4 == "master" {print $1}'
+ masterTablet=
+ '[' ']'
++ jq .master_alias.uid
++ vtctlclient -server aio-zone1-vtctld.default:15999 GetShard sharded-dbname/80-
+ master_alias=null
+ '[' null '!=' null -a null '!=' '' ']'
++ wc
++ echo 'zone1-1676054800 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-0.aio-tab:3306 []
zone1-1676054801 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-1.aio-tab:3306 []
zone1-1676054802 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-2.aio-tab:3306 []
zone1-1676054803 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-3.aio-tab:3306 []
zone1-1676054804 sharded-dbname 80- replica aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:15002 aio-zone1-sharded-dbname-80-x-replica-4.aio-tab:3306 []'
++ awk '{print $1}'
+ tabletCount=5
+ '[' 5 == 2 ']'
+ ((  120 > 600  ))
+ sleep 5
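The trace shows the job counting 5 tablets for the shard, comparing that count to 2, checking an elapsed-time value against 600, and sleeping 5 seconds before retrying. A simplified model of that loop is sketched below, assuming the observed count stays constant and the timeout ends the attempt. Where the expected value of 2 comes from, and what the script does after the timeout, is not visible in the trace.

```python
from typing import Tuple


def wait_for_tablets(
    observed_count: int,
    expected_count: int,
    timeout_s: int = 600,
    interval_s: int = 5,
) -> Tuple[str, int]:
    """Model the retry loop: succeed only on an exact count match.

    Returns ("ready", elapsed) or ("timeout", elapsed). Time is simulated,
    so no real sleeping happens.
    """
    elapsed = 0
    while True:
        if observed_count == expected_count:   # mirrors: [ 5 == 2 ]
            return ("ready", elapsed)
        if elapsed > timeout_s:                # mirrors: (( 115 > 600 ))
            return ("timeout", elapsed)
        elapsed += interval_s                  # stands in for: sleep 5


print(wait_for_tablets(observed_count=5, expected_count=2))
print(wait_for_tablets(observed_count=2, expected_count=2))
```

Because the check is an exact equality, a shard with more tablets than the expected number can never satisfy it, so each attempt runs until the timeout.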

Archive this project

Per vitessio/vitess#5713 we plan to archive this repository.

This issue is to track:

  • Updating the README to point to the Vitess helm charts as an upgrade path.
  • Archiving this project.

Deploying a cluster fails because shards don't elect a master

I'm trying to test the operator by deploying my-vitess.yaml. My shard pods are running with only 5/6 containers ready, the "init-replica-master" jobs never finish, the "init-replica-master" pods disappear at some point, and the shard pods keep running while reporting a "Readiness probe failed" event.

Here is the log for one of the "init-replica-master" pods:
[image: master]

Here is the log of the vttablet container in one of the shard pods:
[image: shard]

Here are the events of one of the shard pods:
[image: eve]

Here is the state of my Kubernetes cluster:
[image: getall]

Thanks for the help.

Add Ingress support

Support the creation of ingress objects for:

  • vtctld
  • vtgate
  • orchestrator
  • pmm

I've also thought about adding headless services and trying to route external dns (probably wildcard) entries to specific tablets/gates. e.g. vttabletabc.vttablet.vitess.myapp.com
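As a rough illustration of what the operator might generate for each component, the sketch below builds an Ingress manifest as a plain dictionary. The `networking.k8s.io/v1` API version, the service names, the port numbers, and the host names are all assumptions chosen for the example; none of them come from the operator.

```python
import json
from typing import Any, Dict


def ingress_manifest(
    component: str, host: str, service_name: str, service_port: int
) -> Dict[str, Any]:
    """Build a single-host Ingress that routes all paths to one Service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{component}-ingress"},
        "spec": {
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }


# Hypothetical component-to-service mapping; real names and ports would differ.
components = {
    "vtctld": ("vtctld.vitess.myapp.com", "vt-zone1-vtctld", 15000),
    "vtgate": ("vtgate.vitess.myapp.com", "vt-zone1-vtgate", 15001),
}

manifests = [
    ingress_manifest(name, host, svc, port)
    for name, (host, svc, port) in components.items()
]
print(json.dumps(manifests[0], indent=2))
```

The same builder could be reused for orchestrator and pmm; per-tablet routing through headless services would need a different mechanism, since an Ingress targets a Service rather than an individual pod.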
