Comments (17)
Hey @Andrzej9729! Thanks for filing this issue.
Are you able to resolve this DNS name from inside the cluster?
nslookup test.namespace.svc.example.apl.pl
You might need to tweak the clusterName value, which is used by the operator to build the FQDN to connect to your MariaDB instances.
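A minimal sketch of such a resolution check, meant to be run with a Python interpreter inside a Pod of the cluster. The function name can_resolve and its injectable resolver parameter are hypothetical and only there for illustration; the hostname below assumes the default cluster.local domain.

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        # getaddrinfo returns a list of address tuples, or raises gaierror.
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Assumed name: <service>.<namespace>.svc.<clusterName>
    name = "test.namespace.svc.cluster.local"
    print(name, "resolves" if can_resolve(name) else "does not resolve")
```

The result depends on the resolver of the machine or Pod where it runs, so it only says something about cluster DNS when executed inside the cluster.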
from mariadb-operator.
Thanks for your answer.
So:
pod/test-0 1/1 Running 0 16h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/test ClusterIP 10.233.33.144 <none> 3306/TCP 16h
NAME READY AGE
statefulset.apps/test 1/1 16h
nslookup test.namespace.svc.example.apl.pl
Server: 127.0.0.53
Address: 127.0.0.53#53
** server can't find test.namespace.svc.example.apl.pl: NXDOMAIN
clusterName: cluster.local
Here's what the command returns. The cluster is set up locally. I've been struggling with this for a while now and I'm out of ideas.
from mariadb-operator.
Hey @Andrzej9729! Thanks for the reply.
nslookup test.namespace.svc.example.apl.pl
Server: 127.0.0.53
Address: 127.0.0.53#53
** server can't find test.namespace.svc.example.apl.pl: NXDOMAIN
Right, so this is an invalid DNS name for your cluster. I wonder where the example.apl.pl comes from.
clusterName: cluster.local
This should be setting the CLUSTER_NAME environment variable in your operator, which is later used in this function:
https://github.com/mariadb-operator/mariadb-operator/blob/main/pkg/statefulset/statefulset.go#L12
Are you by any chance setting the CLUSTER_NAME variable to example.apl.pl in your mariadb-operator Pod?
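To make the question concrete, here is a minimal Python sketch of how a CLUSTER_NAME environment variable could end up in the hostname being resolved. This is not the operator's Go code: the function names and the fallback to cluster.local are assumptions made for illustration.

```python
import os
from typing import Mapping


def cluster_name_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Return CLUSTER_NAME, falling back to 'cluster.local' (assumed default)."""
    return env.get("CLUSTER_NAME") or "cluster.local"


def service_fqdn(service: str, namespace: str, env: Mapping[str, str] = os.environ) -> str:
    """Build <service>.<namespace>.svc.<cluster name> from the environment."""
    return f"{service}.{namespace}.svc.{cluster_name_from_env(env)}"


if __name__ == "__main__":
    # A stale CLUSTER_NAME yields the name that returned NXDOMAIN above.
    print(service_fqdn("test", "namespace", {"CLUSTER_NAME": "example.apl.pl"}))
    print(service_fqdn("test", "namespace", {}))
```

Under these assumptions, a leftover CLUSTER_NAME=example.apl.pl in the operator Pod would produce exactly the test.namespace.svc.example.apl.pl name from the failing lookup.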
from mariadb-operator.
Hey @mmontes11, thanks for the reply
In my config.yaml.j2, I have this setting:
clusterName: {{ mariadb_operator_cluster_domain_name }}
where mariadb_operator_cluster_domain_name="cluster.local"
In the global settings, it was initially set to example.apl.pl, but even after changing it to cluster.local, it didn't make any difference.
from mariadb-operator.
Hey @mmontes11! I think you may need to restart the operator Pods after having set mariadb_operator_cluster_domain_name="cluster.local", which seems to be the right cluster name.
from mariadb-operator.
Yes, I restarted the operator and re-launched the pods after setting it to cluster.local, but it didn't have any effect. I also have an issue with galera:
kubectl -n example get all
NAME READY STATUS RESTARTS AGE
pod/mariadb-galera-0 2/2 Running 0 100s
pod/mariadb-galera-1 1/2 Error 1 (3s ago) 99s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mariadb-galera ClusterIP 10.233.47.69 <none> 3306/TCP 100s
service/mariadb-galera-internal ClusterIP None <none> 3306/TCP,4444/TCP,4567/TCP,4568/TCP,5555/TCP 100s
NAME READY AGE
statefulset.apps/mariadb-galera 1/2 100s
Logs from mariadb-galera-1
kubectl -n example logs mariadb-galera-1
Defaulted container "mariadb" out of: mariadb, agent, init (init)
2023-08-16 10:30:45+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:11.0.3+maria~ubu2204 started.
2023-08-16 10:30:45+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
2023-08-16 10:30:45+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:11.0.3+maria~ubu2204 started.
2023-08-16 10:30:45+00:00 [Note] [Entrypoint]: MariaDB upgrade not required
2023-08-16 10:30:45 0 [Note] Starting MariaDB 11.0.3-MariaDB-1:11.0.3+maria~ubu2204 source revision 70905bcb9059dcc40db3b73bc46a36c7d40f1e10 as process 1
2023-08-16 10:30:45 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2023-08-16 10:30:45 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2023-08-16 10:30:45 0 [Note] WSREP: wsrep_load(): Galera 26.4.14(r06a0c285) by Codership Oy <info@codership.com> loaded successfully.
2023-08-16 10:30:45 0 [Note] WSREP: Initializing allowlist service v1
2023-08-16 10:30:45 0 [Note] WSREP: Initializing event service v1
2023-08-16 10:30:45 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2023-08-16 10:30:45 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2023-08-16 10:30:45 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 00000000-0000-0000-0000-000000000000
Seqno: -1 - -1
Offset: -1
Synced: 1
2023-08-16 10:30:45 0 [Note] WSREP: Skipped GCache ring buffer recovery: could not determine history UUID.
2023-08-16 10:30:45 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.keep_plaintext_size = 128M; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.fc_single_primary = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_
2023-08-16 10:30:45 0 [Note] WSREP: Start replication
2023-08-16 10:30:45 0 [Note] WSREP: Connecting with bootstrap option: 0
2023-08-16 10:30:45 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2023-08-16 10:30:45 0 [Note] WSREP: Using CRC-32C for message checksums.
2023-08-16 10:30:45 0 [Note] WSREP: backend: asio
2023-08-16 10:30:45 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
2023-08-16 10:30:45 0 [Note] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2023-08-16 10:30:45 0 [Note] WSREP: restore pc from disk failed
2023-08-16 10:30:45 0 [Note] WSREP: GMCast version 0
2023-08-16 10:30:45 0 [Warning] WSREP: Failed to resolve tcp://mariadb-galera-0.mariadb-galera-internal.example.svc.cluster.local:4567
2023-08-16 10:30:45 0 [Warning] WSREP: Failed to resolve tcp://mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local:4567
2023-08-16 10:30:45 0 [Note] WSREP: (f5bb9252-a8ca, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2023-08-16 10:30:45 0 [Note] WSREP: (f5bb9252-a8ca, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2023-08-16 10:30:45 0 [Note] WSREP: EVS version 1
2023-08-16 10:30:45 0 [Note] WSREP: gcomm: connecting to group 'mariadb-operator', peer 'mariadb-galera-0.mariadb-galera-internal.example.svc.cluster.local:,mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local:'
2023-08-16 10:30:45 0 [ERROR] WSREP: failed to open gcomm backend connection: 131: No address to connect (FATAL)
at ./gcomm/src/gmcast.cpp:connect_precheck():320
2023-08-16 10:30:45 0 [ERROR] WSREP: ./gcs/src/gcs_core.cpp:gcs_core_open():221: Failed to open backend connection: -131 (State not recoverable)
2023-08-16 10:30:45 0 [ERROR] WSREP: ./gcs/src/gcs.cpp:gcs_open():1669: Failed to open channel 'mariadb-operator' at 'gcomm://mariadb-galera-0.mariadb-galera-internal.example.svc.cluster.local,mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local': -131 (State not recoverable)
2023-08-16 10:30:45 0 [ERROR] WSREP: gcs connect failed: State not recoverable
2023-08-16 10:30:45 0 [ERROR] WSREP: wsrep::connect(gcomm://mariadb-galera-0.mariadb-galera-internal.example.svc.cluster.local,mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local) failed: 7
2023-08-16 10:30:45 0 [ERROR] Aborting
I have checked all the previously discussed threads, and nothing has worked. This is my configuration:
apiVersion: mariadb.mmontes.io/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  rootPasswordSecretKeyRef:
    name: testdb
    key: root-password
  database: galera
  username: andrzej
  passwordSecretKeyRef:
    name: testdb
    key: password
  image:
    repository: mariadb
    tag: "11.0.3"
    pullPolicy: IfNotPresent
  replicas: 2
  galera:
    enabled: true
    sst: mariabackup
    replicaThreads: 1
    agent:
      image:
        repository: ghcr.io/mariadb-operator/agent
        tag: "v0.0.2"
        pullPolicy: IfNotPresent
      port: 5555
      kubernetesAuth:
        enabled: true
      gracefulShutdownTimeout: 5s
    recovery:
      enabled: true
      clusterHealthyTimeout: 5m
      clusterBootstrapTimeout: 10m
      podRecoveryTimeout: 10m
      podSyncTimeout: 10m
    initContainer:
      image:
        repository: ghcr.io/mariadb-operator/init
        tag: "v0.0.5"
        pullPolicy: IfNotPresent
    volumeClaimTemplate:
      resources:
        requests:
          storage: 1Gi
      accessModes:
        - ReadWriteOnce
  podSecurityContext:
    runAsUser: 0
  myCnf: |
    [mariadb]
    bind-address=0.0.0.0
    ignore_db_dirs=lost+found
  port: 3306
  volumeClaimTemplate:
    resources:
      requests:
        storage: 1Gi
    accessModes:
      - ReadWriteOnce
  service:
    type: ClusterIP
I tried changing the image to different versions, but that didn't make any difference either.
Can you help me with Galera, please?
from mariadb-operator.
I know these are two different issues, but I'm having trouble with SQL jobs and Galera.
from mariadb-operator.
Hey there,
The galera feature relies heavily on individually addressable Pods via DNS, which is precisely what is failing in the SqlJob issue. Besides, this galera error seems related:
2023-08-16 10:30:45 0 [Note] WSREP: gcomm: connecting to group 'mariadb-operator', peer 'mariadb-galera-0.mariadb-galera-internal.example.svc.cluster.local:,mariadb-galera-1.mariadb-galera-internal.example.svc.cluster.local:'
2023-08-16 10:30:45 0 [ERROR] WSREP: failed to open gcomm backend connection: 131: No address to connect (FATAL)
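As an illustration of why per-Pod DNS matters here, the peer list in the log above can be reproduced with a short Python sketch. The helper names are hypothetical, and the naming scheme (Pods named <name>-<ordinal> behind a headless Service named <name>-internal) is inferred from the log output, not taken from the operator's code.

```python
def pod_fqdn(pod: str, headless_service: str, namespace: str,
             cluster_name: str = "cluster.local") -> str:
    """Per-Pod DNS name: <pod>.<headless-service>.<namespace>.svc.<cluster_name>."""
    return f"{pod}.{headless_service}.{namespace}.svc.{cluster_name}"


def gcomm_address(name: str, replicas: int, namespace: str,
                  cluster_name: str = "cluster.local") -> str:
    """Join the FQDNs of all replicas into a gcomm:// peer list."""
    # Assumed naming, inferred from the log: "<name>-<i>" and "<name>-internal".
    hosts = [
        pod_fqdn(f"{name}-{i}", f"{name}-internal", namespace, cluster_name)
        for i in range(replicas)
    ]
    return "gcomm://" + ",".join(hosts)


if __name__ == "__main__":
    print(gcomm_address("mariadb-galera", 2, "example"))
```

Every host in that list has to resolve through cluster DNS, so a cluster name that does not match the cluster's actual domain would leave Galera with no address to connect to.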
I suggest we focus on resolving the DNS issue of the SqlJobs first, which is something that galera relies on as well. Feel free to take a look at the troubleshooting guide for galera and open a different issue if it doesn't help:
Can we start by creating a simple single-node MariaDB and, later on, a SqlJob?
If that works, we can move on with galera.
from mariadb-operator.
Hey there @Andrzej9729, I've just spotted a bug in Galera related to this:
Will be fixed in v0.0.20
from mariadb-operator.
Hey @mmontes11, I will try a few other methods and will let you know about the results.
from mariadb-operator.
I tested Galera locally on Minikube, and everything worked correctly except for one concerning issue that I've encountered before.
2023-08-21 7:59:58 0 [Note] Plugin 'FEEDBACK' is disabled.
2023-08-21 7:59:58 0 [Note] Plugin 'wsrep-provider' is disabled.
2023-08-21 7:59:58 0 [Note] mariadbd: ready for connections.
Version: '11.0.2-MariaDB-1:11.0.2+maria~ubu2204' socket: '/run/mysqld/mysqld.sock' port: 0 mariadb.org binary distribution
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Temporary server started.
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Creating database galera
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Creating user andrzej
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Giving user andrzej access to schema galera
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Securing system users (equivalent to running mysql_secure_installation)
2023-08-21 07:59:59+00:00 [Note] [Entrypoint]: Stopping temporary server
2023-08-21 7:59:59 0 [Note] mariadbd (initiated by: unknown): Normal shutdown
2023-08-21 7:59:59 0 [Note] InnoDB: FTS optimize thread exiting.
2023-08-21 7:59:59 0 [Note] InnoDB: Starting shutdown...
2023-08-21 7:59:59 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2023-08-21 7:59:59 0 [Note] InnoDB: Buffer pool(s) dump completed at 230821 7:59:59
2023-08-21 8:00:00 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-08-21 8:00:00 0 [Note] InnoDB: Shutdown completed; log sequence number 47273; transaction id 15
2023-08-21 8:00:00 0 [Note] mariadbd: Shutdown complete
2023-08-21 08:00:00+00:00 [Note] [Entrypoint]: Temporary server stopped
2023-08-21 08:00:00+00:00 [Note] [Entrypoint]: MariaDB init process done. Ready for start up.
2023-08-21 8:00:00 0 [Note] Starting MariaDB 11.0.2-MariaDB-1:11.0.2+maria~ubu2204 source revision 0005f2f06c8e1aea4915887decad67885108a929 as process 1
2023-08-21 8:00:00 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2023-08-21 8:00:00 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2023-08-21 8:00:00 0 [Note] WSREP: wsrep_load(): Galera 26.4.14(r06a0c285) by Codership Oy <info@codership.com> loaded successfully.
2023-08-21 8:00:00 0 [Note] WSREP: Initializing allowlist service v1
2023-08-21 8:00:00 0 [Note] WSREP: Initializing event service v1
2023-08-21 8:00:00 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2023-08-21 8:00:00 0 [Warning] WSREP: Could not open state file for reading: '/var/lib/mysql//grastate.dat'
2023-08-21 8:00:00 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2023-08-21 8:00:00 0 [Note] WSREP: GCache DEBUG: opened preamble:
Plugin 'wsrep-provider' is disabled
And there is also a warning.
[Warning] WSREP: Could not open state file for reading: '/var/lib/mysql//grastate.dat'
However, in the end, I got everything working.
NAME READY STATUS RESTARTS AGE
pod/mariadb-galera-0 2/2 Running 0 8m36s
pod/mariadb-galera-1 2/2 Running 0 8m36s
It appears that you are correct, as the issue arises when the Cluster_Name is different from cluster.local.
from mariadb-operator.
I noticed that the "[Note] Plugin 'wsrep-provider' is disabled." message disappears when we use an older image version, 10.11.3. The 11.0.2 image version also lacks the mysql executable. It might be worth considering adding a feature that installs mysql inside the created containers, as mysql has been removed in the 11.x.x versions.
from mariadb-operator.
It appears that you are correct, as the issue arises when the Cluster_Name is different from cluster.local.
This was fixed by #206. Will be released in v0.0.20, stay tuned.
[Note] Plugin 'wsrep-provider' is disabled.
I believe this is just a warning, right? Did your operator manage to bootstrap the cluster eventually?
It might be worth considering adding a feature that installs mysql inside the created containers, as mysql has been removed in the 11.x.x versions.
That's not the plan. MariaDB 11.x.x will no longer be relying on mysql.
cc @grooverdan
from mariadb-operator.
Yes, the operator successfully started the cluster using the 10.11.3 image.
"[Note] Plugin 'wsrep-provider' is disabled." is an informational (Note) message that appears when we use the 11.x.x image version.
When do you plan to release version v0.0.20?
from mariadb-operator.
When do you plan to release version v0.0.20?
In the next few weeks. Feel free to subscribe to the GitHub releases or join our Slack channel to stay tuned! 🚀
After the release, please test this again and close the issue if it has been fixed.
from mariadb-operator.
It might be worth considering adding a feature that installs mysql inside the created containers, as mysql has been removed in the 11.x.x versions.
That's not the plan. MariaDB 11.x.x will no longer be relying on mysql
cc @grooverdan
mariadb has been there as a client executable since 10.4. I haven't seen a compelling reason to add mysql* named executables back into the container.
from mariadb-operator.
Fixed in v0.0.20:
from mariadb-operator.