Could there be a problem with clusterchk.socket and systemd, where the system fills up with so many file descriptors that it can no longer handle them?
xxxxxxxxxxxxx@server1:~$ sudo systemctl start reboot.target
Failed to start reboot.target: Argument list too long
See system logs and 'systemctl status reboot.target' for details.
xxxxxxxxxxxxxx@server1:~$ sudo systemctl status reboot.target
Failed to get properties: Unknown object '/org/freedesktop/systemd1/unit/reboot_2etarget'.
xxxxxxxxxxxxxx@server1:~$ sudo systemctl reboot
Failed to reboot system via logind: Invalid request descriptor
Failed to start reboot.target: Argument list too long
See system logs and 'systemctl status reboot.target' for details.
xxxxxxxxxxx@server1:~$ sudo journalctl --unit dbus
-- Journal begins at Tue 2023-11-21 22:04:31 CET, ends at Mon 2023-11-27 11:34:07 CET. --
Nov 27 07:31:47 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:32:12 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:32:37 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:33:02 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:33:27 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:33:52 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:34:17 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:34:42 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:35:07 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:35:32 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:35:57 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:36:22 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:36:47 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:37:12 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:37:37 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:38:02 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:38:27 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:38:52 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:39:17 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:39:42 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:40:07 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:40:32 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:40:57 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:41:22 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:41:47 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:42:12 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:42:37 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:43:02 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:43:27 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:43:52 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:44:17 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:44:42 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:45:07 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:45:32 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:45:57 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:46:22 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:46:47 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:47:12 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:47:37 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:48:02 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:48:27 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:48:52 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:49:17 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:49:42 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:50:07 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:50:32 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:50:57 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 07:51:19 server1 dbus-daemon[477]: [system] Successfully activated service 'org.freedesktop.systemd1'
Nov 27 08:15:03 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 08:15:28 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 08:15:53 server1 dbus-daemon[477]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Nov 27 08:16:02 server1 dbus-daemon[477]: [system] Successfully activated service 'org.freedesktop.systemd1'
-- Boot 29a3a40650fe42d4a37bdce0bdf2b71d --
Nov 27 08:19:11 server1 systemd[1]: Started D-Bus System Message Bus.
Nov 27 08:19:28 server1 systemd[1]: Stopping D-Bus System Message Bus...
Nov 27 08:19:28 server1 systemd[1]: dbus.service: Succeeded.
Nov 27 08:19:28 server1 systemd[1]: Stopped D-Bus System Message Bus.
-- Boot 363a9b051c5d4b0289eec63faec54352 --
Nov 27 08:19:45 server1 systemd[1]: Started D-Bus System Message Bus.
xxxxxxxxxxxx@server1:~$ sudo journalctl --unit clusterchk.socket
-- Journal begins at Tue 2023-11-21 22:04:31 CET, ends at Mon 2023-11-27 11:35:11 CET. --
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Failed to queue service startup job (Maybe the service file is missing or not a template unit?): Argument list too long
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Failed with result 'resources'.
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Consumed 26min 57.357s CPU time.
Nov 27 07:51:19 server1 systemd[1]: Listening on Clusterchk socket.
Nov 27 07:51:32 server1 systemd[1]: clusterchk.socket: Failed to queue service startup job (Maybe the service file is missing or not a template unit?): Argument list too long
Nov 27 07:51:32 server1 systemd[1]: clusterchk.socket: Failed with result 'resources'.
-- Boot 29a3a40650fe42d4a37bdce0bdf2b71d --
Nov 27 08:19:11 server1 systemd[1]: Listening on Clusterchk socket.
Nov 27 08:19:33 server1 systemd[1]: clusterchk.socket: Succeeded.
Nov 27 08:19:33 server1 systemd[1]: Closed Clusterchk socket.
-- Boot 363a9b051c5d4b0289eec63faec54352 --
Nov 27 08:19:45 server1 systemd[1]: Listening on Clusterchk socket.
Nov 19 01:38:25 server1 systemd[1]: Started Check the status of Galera/MySQL (xxx.xxx.xxx.xxx:33982).
Nov 19 01:38:25 server1 systemd[1]: Started Check the status of Galera/MySQL (xxx.xxx.xxx.xxx:44600).
Nov 19 01:38:25 server1 clusterchk.sh[4010546]: /bin/echo: write error: Connection reset by peer
Nov 19 01:38:25 server1 clusterchk.sh[4010547]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 clusterchk.sh[4010548]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 clusterchk.sh[4010549]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 systemd[1]: [email protected]:9999-xxx.xxx.xxx.xxx:33982.service: Main process exited, code=exited, status=1/FAILURE
Nov 19 01:38:25 server1 systemd[1]: [email protected]:9999-xxx.xxx.xxx.xxx:33982.service: Failed with result 'exit-code'.
Nov 19 01:38:25 server1 clusterchk.sh[4010555]: /bin/echo: write error: Connection reset by peer
Nov 19 01:38:25 server1 clusterchk.sh[4010556]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 clusterchk.sh[4010557]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 clusterchk.sh[4010558]: /bin/echo: write error: Broken pipe
Nov 19 01:38:25 server1 systemd[1]: [email protected]:9999-xxx.xxx.xxx.xxx:44600.service: Main process exited, code=exited, status=1/FAILURE
Nov 19 01:38:25 server1 systemd[1]: [email protected]:9999-xxx.xxx.xxx.xxx:44600.service: Failed with result 'exit-code'.
Nov 25 21:18:48 server1 systemd[1]: cannot add name, manager has too many units: Argument list too long
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Failed to queue service startup job (Maybe the service file is missing or not a template unit?): Argument list too long
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Failed with result 'resources'.
Nov 25 21:18:48 server1 systemd[1]: clusterchk.socket: Consumed 26min 57.357s CPU time.
Nov 26 00:00:08 server1 systemd[1]: cannot add name, manager has too many units: Argument list too long
Nov 26 00:00:08 server1 systemd[1]: cannot add name, manager has too many units: Argument list too long
Nov 26 00:00:08 server1 systemd[1]: cannot add name, manager has too many units: Argument list too long
This leads to a very slow system that cannot be rebooted (only a hard reset at the VM level is possible).
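Since the journal says "manager has too many units", one way to check whether per-connection instances of the socket-activated service are piling up is to count them. The following is only a rough sketch: the helper name `count_instances` is made up for illustration, and the `clusterchk@` prefix is an assumption based on the unit names in the log above. Whether leftover instances are really the cause is not confirmed.

```shell
#!/bin/sh
# Hypothetical helper: count lines of a `systemctl list-units` listing whose
# unit name starts with the given template prefix. The listing is read from
# stdin, so the function itself never talks to systemd.
count_instances() {
    # $1: template prefix, e.g. "clusterchk@" (assumed from the log above)
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when there are no matches (grep still prints 0).
    grep -c "^[[:space:]]*$1" || true
}

# Possible usage on the affected host (not run here):
#   systemctl list-units --all --plain --no-legend 'clusterchk@*' \
#       | count_instances 'clusterchk@'
#
# If the count is very large and mostly failed instances, clearing them
# might be worth trying as a workaround (untested assumption):
#   sudo systemctl reset-failed 'clusterchk@*'
```

If the number keeps growing between checks, that would at least be consistent with the "too many units" errors at Nov 25 21:18:48.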