
lmd's People

Contributors

c0xc, danirod, dependabot[bot], dgilm, erikdsjostrom, jacobbaungard, jdumalaonitrs, jimorie, klaernie, lausser, llange, metricspace, sjoegren, sni, tmuncks, vaxvms


lmd's Issues

Parent host across LMD

I'm looking for a solution or documentation for using Naemon host parent relationships with LMD.
We need to detect when resources are unreachable because our shared infrastructure is down.
Can you explain whether this is possible, and how? Thanks.

(schema attached)

Compatibility with older livestatus releases - `Table 'hosts' has no column 'event_handler'`

Hello,

On some installations I'm stuck with an older release of livestatus (1.1.12p6, the check_mk one, for Nagios 3).
The column event_handler was added to the hosts table a few releases later (I believe in 1.2.0).

LMD tries to access this column at startup, and the error Table 'hosts' has no column 'event_handler' prevents any further use of these backends.

Would it be possible to handle this special legacy case?
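
One way to handle such legacy installations could be to probe the columns meta table before requesting optional columns like event_handler, and to skip columns the backend does not know about. Below is a minimal, hypothetical sketch of such a probe in Go; the socket path and the helper name are made up for illustration and this is not how LMD is actually implemented:

    package main

    import (
        "bufio"
        "fmt"
        "io"
        "net"
        "strings"
    )

    // hasColumn asks a livestatus backend whether a table provides a given
    // column by querying the "columns" meta table. Purely illustrative.
    func hasColumn(socket, table, column string) (bool, error) {
        conn, err := net.Dial("unix", socket)
        if err != nil {
            return false, err
        }
        defer conn.Close()

        query := "GET columns\nColumns: name\nFilter: table = " + table +
            "\nOutputFormat: csv\nResponseHeader: fixed16\n\n"
        if _, err := conn.Write([]byte(query)); err != nil {
            return false, err
        }

        r := bufio.NewReader(conn)
        // Skip the 16-byte fixed16 response header (status + body length).
        if _, err := io.ReadFull(r, make([]byte, 16)); err != nil {
            return false, err
        }
        body, err := io.ReadAll(r)
        if err != nil {
            return false, err
        }
        for _, line := range strings.Split(string(body), "\n") {
            if strings.TrimSpace(line) == column {
                return true, nil
            }
        }
        return false, nil
    }

    func main() {
        // Hypothetical socket path of the legacy backend.
        ok, err := hasColumn("/var/run/nagios/livestatus.sock", "hosts", "event_handler")
        fmt.Println("hosts.event_handler available:", ok, err)
    }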

Icinga2 problems after upgrading to 1.1.4

After upgrading to LMD 1.1.4, Thruk stops showing any data and gives:

No Backend available
None of the configured Backends could be reached, please have a look at the logfile for detailed information and make sure the core is up and running.

Details:
icinga-test-srv02.example.com: broken: got more services than expected. Hint: check clients 'max_response_size' setting. (/var/run/icinga2/cmd/livestatus)

It could be related to this commit: 07f5358

And the problem might be that some object has been added via the API that lmd doesn't know about.

Panic: assignment to entry in nil map

Saddened to announce that something similar to #46 is happening here again. However, the stack trace is slightly different, and the panic reason is different as well: assignment to entry in nil map, located in:

    runtime/debug.Stack(0xc4201c4030, 0x9472c1, 0xe)
            /root/.gvm/gos/go1.9.2/src/runtime/debug/stack.go:24 +0xa7
    main.logPanicExitPeer(0xc42028a0c0)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:2754 +0x12f
    panic(0x8b9ce0, 0x9a4dd0)
            /root/.gvm/gos/go1.9.2/src/runtime/panic.go:491 +0x283
    main.(*DataTable).AddItem(...)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:153
    main.(*Peer).UpdateDeltaCommentsOrDowntimes(0xc42028a0c0, 0x943b38, 0x8, 0x0, 0x0)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:1128 +0x1453
    main.(*Peer).UpdateDeltaTables(0xc42028a0c0, 0x944e05)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:798 +0xb1d
    main.(*Peer).periodicUpdate(0xc42028a0c0, 0xc42ee75e7e, 0xc42ee75e80)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:415 +0x370
    main.(*Peer).updateLoop(0xc42028a0c0)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:346 +0x30b
    main.(*Peer).Start.func1(0xc42028a0c0)
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:245 +0x59
    created by main.(*Peer).Start
            /var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:242 +0x17f

This time I have more detail to provide, although I haven't been able to reproduce this bug locally despite trying hard. I'm just providing data discovered after analyzing days of lmd logs in one of our client installations. I even increased the logging level with the hope of catching some extra details, but I've had no luck (and received a gigabyte log file as a side effect).

What I see is that the panic always happens after the connection with the site is lost. In other words, up to a minute before the panic happens I always see the warning site went offline: dial unix ...{path to livestatus socket file}... : connect: resource temporarily unavailable in the logs. It's not reciprocal: a peer going down does not mean that lmd will crash, but lmd only crashes after a peer has gone down. I'm wondering if the same is true for issue #46, but I haven't checked yet.

Happy to provide more feedback if needed, although I still have no idea how to reproduce this locally because I don't understand how the peer is failing in this case.

Some backend configs are read-only

Hi,
since I activated LMD, the config of some backends is read-only on the umbrella. If LMD is deactivated, they can be edited.
Is there a setting in ~/etc/thruk/lmd.ini?

All backends are 'http' backends

Or could it be a version issue? Umbrella = OMD 4.00, Backends = OMD 3.30

Thx & BR Oliver

Panic: runtime error: invalid memory address or nil pointer dereference

I run a test setup with two backends and one frontend. All sites are using Checkmk Raw edition 1.6.0p6.

lmd occasionally crashes with this stack trace:

[2019-12-10 15:47:55.854][Error][main.go:736] Panic: runtime error: invalid memory address or nil pointer dereference
[2019-12-10 15:47:55.855][Error][main.go:737] Version: 1.7.1 (Build: )
[2019-12-10 15:47:55.855][Error][main.go:738] goroutine 195 [running]:
runtime/debug.Stack(0x98f6c0, 0xc0003cd5b0, 0x2)
	/opt/golang/go/src/runtime/debug/stack.go:24 +0x9d
main.logPanicExit()
	/home/harald/go/src/github.com/sni/lmd/lmd/main.go:738 +0x2c3
panic(0x9c15e0, 0xf1e5e0)
	/opt/golang/go/src/runtime/panic.go:679 +0x1b2
main.(*DataStore).GetColumn(...)
	/home/harald/go/src/github.com/sni/lmd/lmd/datastore.go:130
main.GetGroupByData(0xc000033000, 0xc0002360e0, 0x0)
	/home/harald/go/src/github.com/sni/lmd/lmd/virtstore.go:63 +0xb4b
main.(*Peer).GetDataStore(0xc0002360e0, 0x11, 0x9, 0x98e500, 0xc00049c850)
	/home/harald/go/src/github.com/sni/lmd/lmd/peer.go:2902 +0x1c0
main.(*Response).BuildLocalResponse(0xc00008e960, 0xc0003cd3a0, 0x2, 0x2)
	/home/harald/go/src/github.com/sni/lmd/lmd/response.go:483 +0x14f
main.NewResponse(0xc0000eaf20, 0x32f2513100a38820, 0x5defb01b, 0xc0002819d8)
	/home/harald/go/src/github.com/sni/lmd/lmd/response.go:101 +0x43c
main.(*Request).GetResponse(0xc0000eaf20, 0x4899e9948, 0xf309e0, 0x0)
	/home/harald/go/src/github.com/sni/lmd/lmd/request.go:348 +0x147
main.ProcessRequests(0xc0001380b0, 0x1, 0x1, 0xb4b600, 0xc000138068, 0xc00049c4e0, 0xf, 0xc0000ea840, 0x0, 0x0, ...)
	/home/harald/go/src/github.com/sni/lmd/lmd/listener.go:150 +0x30a
main.QueryServer(0xb4b600, 0xc000138068, 0xc0000ea840, 0x0, 0x0)
	/home/harald/go/src/github.com/sni/lmd/lmd/listener.go:98 +0x3f2
main.(*Listener).LocalListenerLivestatus.func2.1(0xc00008e780, 0xb4b600, 0xc000138068, 0xc00014a080)
	/home/harald/go/src/github.com/sni/lmd/lmd/listener.go:330 +0x76
created by main.(*Listener).LocalListenerLivestatus.func2
	/home/harald/go/src/github.com/sni/lmd/lmd/listener.go:326 +0xc1

My configuration looks like:

Listen          = ["127.0.0.1:3333"]

[[Connections]]
name   = "Test 3"
id     = "test3"
source = ["127.0.0.1:6558"]

[[Connections]]
name   = "Test 5"
id     = "test5"
source = ["127.0.0.1:6557"]

# all the other directives are exactly like in lmd.ini.example.

LMD 2.0.0 fails updating host/service deltas with a Shinken backend

Shinken receives the following incomplete livestatus request (without a GET services or GET hosts line and the rest of the query):

Filter: is_executing = 1

After disabling SyncIsExecuting in lmd.ini (which is not the default value), host/service delta updates are performed correctly.

SyncIsExecuting = false

With LMD 1.9.4 there is no such problem.

Getting error message starting lmd failed with rc -1: [No child processes]

Hi,

I am running Thruk 2.14-1 with lmd (1.0.3), go 1.6.3 and icinga2-2.4.10-1 on CentOS 7 64-bit. For the past 2 days, when running SLA reports in Thruk, I was getting this error: "malformed UTF-8 character in JSON string, at character offset 5709811 (before "\x{fffd}\x{fffd}is a...") at /usr/share/thruk/lib/Monitoring/Livestatus.pm line 1167". I searched the Thruk site and it was recommended to run lmd to get rid of these errors, so I installed lmd by following this page: https://www.thruk.org/documentation/lmd.html.

The issue I am facing now is that as soon as I restart httpd and lmd starts, Thruk stops showing any hosts from Icinga. After stopping lmd the hosts appear in Thruk again, but the SLA reports are broken. It seems lmd is crashing. Do let me know if you need any further information.

The Thruk backend (livestatus) is connected to icinga2 both at the socket level and at the TCP level.

The error messages from the Thruk logs:

[ERROR][Thruk] lmd not running, starting up...
[2017/04/26 07:49:18][icingamaster1.example.com][ERROR][Thruk] starting lmd failed with rc -1: [No child processes]
[2017/04/26 07:50:26][icingamaster1.example.com][ERROR][Thruk] starting lmd failed with rc -1: [No child processes]
[2017/04/26 07:50:26][icingamaster1.example.com][ERROR][Thruk] lmd not running, starting up...
[2017/04/26 07:50:26][icingamaster1.example.com][ERROR][Thruk] starting lmd failed with rc -1: [No child processes]
[2017/04/26 07:50:46][icingamaster1.example.com][ERROR][Thruk] unknown report type

lmd.log

[2017-04-26 07:38:20][Fatal][main.go:180] no connections defined
[2017-04-26 07:39:59][Fatal][main.go:180] no connections defined
[2017-04-26 07:40:51][Warn][peer.go:1039] [icinga2] site went offline: [icinga2] bad response: 452 1291
Error: Column 'worst_service_state' does not exist in table 'hostgroups'.

(0) liblivestatus.so: void boost::throw_exception<boost::exception_detail::error_info_injector<std::invalid_argument> >(boost::exception_detail::error_info_injector<std::invalid_argument> const&) (+0xf8) [0x2b35803e07f8]
(1) liblivestatus.so: void boost::exception_detail::throw_exception_<std::invalid_argument>(std::invalid_argument const&, char const*, char const*, int) (+0x79) [0x2b35803e08c9]
(2) liblivestatus.so: icinga::Table::GetColumn(icinga::String const&) const (+0x374) [0x2b3580386184]
(3) liblivestatus.so: icinga::LivestatusQuery::ExecuteGetHelper(boost::intrusive_ptr<icinga::Stream> const&) (+0xd58) [0x2b35803d33c8]
(4) liblivestatus.so: icinga::LivestatusQuery::Execute(boost::intrusive_ptr<icinga::Stream> const&) (+0x237) [0x2b35803d3c37]
(5) liblivestatus.so: icinga::LivestatusListener::ClientHandler(boost::intrusive_ptr<icinga::Socket> const&) (+0x1c7) [0x2b35803d4047]
(6) libbase.so: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x378) [0x2b356f19ed98]
(7) libboost_thread-mt.so.1.53.0: (+0xd24a) [0x2b356e70124a]
(8) libpthread.so.0: (+0x7df5) [0x2b3571aa5df5]
(9) libc.so.6: clone (+0x6d) [0x2b3571db01ad]

lmd.ini:
Listen = ['/var/cache/thruk/lmd/live.sock']
LogFile = '/var/cache/thruk/lmd/lmd.log'
LogLevel = 'Warn'

[[Connections]]
name = 'icinga2'
id = '11120'
source = ['localhost:6558']

Thanks & Regards

Ankush Grover

Add csv OutputFormat

Livestatus supports CSV output, which we currently use. We would like to start using LMD, but much of our existing code relies on the CSV output of Livestatus. Please consider adding CSV output to LMD, as it would be very useful for compatibility and simplicity.
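
As a stopgap until a native csv OutputFormat exists, a thin client-side shim can turn the JSON result LMD already produces back into CSV. A minimal sketch in Go (hypothetical, not part of LMD; it assumes the query was sent with OutputFormat: json so the result is a JSON list of rows):

    package main

    import (
        "encoding/csv"
        "encoding/json"
        "fmt"
        "os"
    )

    // jsonToCSV converts a livestatus JSON result (a list of rows, each row a
    // list of values) into CSV. Adding "ColumnHeaders: on" to the query makes
    // the first row the column names, so the CSV gets a header line for free.
    func jsonToCSV(raw []byte, out *csv.Writer) error {
        var rows [][]interface{}
        if err := json.Unmarshal(raw, &rows); err != nil {
            return err
        }
        for _, row := range rows {
            record := make([]string, len(row))
            for i, v := range row {
                record[i] = fmt.Sprint(v)
            }
            if err := out.Write(record); err != nil {
                return err
            }
        }
        out.Flush()
        return out.Error()
    }

    func main() {
        // Example result as it would come back with OutputFormat: json.
        raw := []byte(`[["host1","OK",0],["host2","CRITICAL",2]]`)
        if err := jsonToCSV(raw, csv.NewWriter(os.Stdout)); err != nil {
            panic(err)
        }
    }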

Inconsistent display of downtimes in Thruk (2.44.3) with lmd + icinga2

Whenever I apply a downtime on multiple services, the effect is not displayed immediately in the list.

Even after 20 minutes I see maybe 50% of the services with the downtime. Disabling lmd restores normal visibility. I tested this both with lmd against an icinga2 HA cluster (2 peers in lmd) and against a single instance.

Inspecting individual services via thruk displays their downtimes correctly.

bad response 404 on Icinga 2.11

I get a bad response: 404 31 error from lmd on a fresh install. Tried through Thruk and with standalone lmd. Here is the trace info:

[2020-01-03 03:10:20.331][Info][listener.go:296] listening for incoming queries on unix /tmp/lmd.sock
[2020-01-03 03:10:20.332][Info][listener.go:365] listening for rest queries on :8080
[2020-01-03 03:10:20.332][Info][listener.go:296] listening for incoming queries on tcp 127.0.0.1:3333
[2020-01-03 03:10:20.332][Info][peer.go:251] [Prod] starting connection
[2020-01-03 03:10:20.332][Trace][peer.go:1632] [Prod] no cached connection found
[2020-01-03 03:10:20.332][Trace][peer.go:1322] [Prod] query: GET status
ResponseHeader: fixed16
OutputFormat: json
Columns: program_start accept_passive_host_checks accept_passive_service_checks cached_log_messages check_external_commands check_host_freshness check_service_freshness connections connections_rate enable_event_handlers enable_flap_detection enable_notifications execute_host_checks execute_service_checks forks forks_rate host_checks host_checks_rate interval_length last_command_check last_log_rotation livestatus_version log_messages log_messages_rate nagios_pid neb_callbacks neb_callbacks_rate obsess_over_hosts obsess_over_services process_performance_data program_version requests requests_rate service_checks service_checks_rate
KeepAlive: on

[2020-01-03 03:10:20.333][Trace][peer.go:1359] [Prod] result: [[1576361802.0,1.0,1.0,0.0,1.0,1.0,1.0,3208.0,0.0019334410862108127,1.0,1.0,1.0,1.0,1.0,0.0,0.0,43476.0,0.026202707189399623,60.0,0.0,0.0,"2.11.2-1",0.0,0.0,42381.0,0.0,0.0,0.0,0.0,1.0,"2.11.2-1",0.0,0.0,601108.0,0.36228394776641054]]
[2020-01-03 03:10:20.333][Debug][peer.go:1381] [Prod] fetched table: status - time: 0s - count: 1 - size: 0 kB
[2020-01-03 03:10:20.333][Trace][peer.go:1632] [Prod] no cached connection found
[2020-01-03 03:10:20.334][Trace][peer.go:1322] [Prod] query: GET columns
ResponseHeader: fixed16
OutputFormat: json
Columns: table name
KeepAlive: on

[2020-01-03 03:10:20.334][Debug][peer.go:1475] [Prod] LastQuery:
[2020-01-03 03:10:20.334][Debug][peer.go:1476] [Prod] GET columns
ResponseHeader: fixed16
OutputFormat: json
Columns: table name
KeepAlive: on

[2020-01-03 03:10:20.334][Debug][peer.go:1342] [Prod] sending data/query failed: [Prod] bad response: 404 31
[2020-01-03 03:10:20.334][Trace][peer.go:1308] [Prod] put cached connection back
[2020-01-03 03:10:20.334][Debug][peer.go:1665] [Prod] connection error 192.168.0.101:6666: [Prod] bad response: 404 31
[2020-01-03 03:10:20.334][Debug][peer.go:1699] [Prod] last online: never
[2020-01-03 03:10:20.334][Info][peer.go:1702] [Prod] site went offline: [Prod] bad response: 404 31
[2020-01-03 03:10:25.334][Trace][peer.go:1632] [Prod] no cached connection found
[2020-01-03 03:10:25.335][Trace][peer.go:1322] [Prod] query: GET status
ResponseHeader: fixed16
OutputFormat: json
Columns: program_start accept_passive_host_checks accept_passive_service_checks cached_log_messages check_external_commands check_host_freshness check_service_freshness connections connections_rate enable_event_handlers enable_flap_detection enable_notifications execute_host_checks execute_service_checks forks forks_rate host_checks host_checks_rate interval_length last_command_check last_log_rotation livestatus_version log_messages log_messages_rate nagios_pid neb_callbacks neb_callbacks_rate obsess_over_hosts obsess_over_services process_performance_data program_version requests requests_rate service_checks service_checks_rate
KeepAlive: on

[2020-01-03 03:10:25.339][Trace][peer.go:1359] [Prod] result: [[1576361802.0,1.0,1.0,0.0,1.0,1.0,1.0,3212.0,0.0019358460202393028,1.0,1.0,1.0,1.0,1.0,0.0,0.0,43418.0,0.026167672013151556,60.0,0.0,0.0,"2.11.2-1",0.0,0.0,42381.0,0.0,0.0,0.0,0.0,1.0,"2.11.2-1",0.0,0.0,599096.0,0.3610702388930429]]
[2020-01-03 03:10:25.339][Debug][peer.go:1381] [Prod] fetched table: status - time: 0s - count: 1 - size: 0 kB
[2020-01-03 03:10:25.339][Trace][peer.go:1632] [Prod] no cached connection found

Any ideas? It looks like it stopped working with 2.11, but I'm not quite sure yet.

Thanks!

Objects are considered static and do not change when updated via the Icinga2 HTTP API

Previously, a lot of configuration such as contacts, hosts and probably many other things has been considered static in Nagios-like monitoring systems. This used to make sense, since adding or removing them required a restart of the monitoring system. With this in mind, LMD does not add or remove such objects until the monitoring backend is restarted.

However, when using the Icinga2 HTTP API, a restart of the backend is no longer required for adding or removing such objects. Therefore LMD needs to somehow keep track of whether these objects have changed and refresh them if necessary.

Example: a new user is added via the Icinga2 API and tries to log in to Thruk using some external authentication mechanism in Apache. This does not work, since LMD does not recognize that the user exists until Icinga2 is restarted.

Example 2: old hosts are removed via the Icinga2 API; they will continue to be visible in Thruk, even though they are no longer configured in Icinga2, until Icinga2 is restarted.

One way could be for LMD to simply check the number of objects in the backend, compare it to the number of objects in the cache, and at least detect a discrepancy; if there is one, refresh the cache.
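
A rough sketch of that idea in Go, just to make it concrete (all types and helpers here are hypothetical stand-ins, not LMD's actual internals):

    package main

    import "fmt"

    // Sketch only: the type and helpers below are hypothetical stand-ins,
    // not LMD's real internals.

    type peerCache struct {
        hosts []string // names of the hosts currently held in the cache
    }

    // backendHostCount would issue a counting query such as
    //     GET hosts
    //     Stats: state >= 0
    // against the backend and return the single counter from the result.
    func backendHostCount() (int, error) {
        return 42, nil // placeholder value for the sketch
    }

    // needsFullRefresh compares the backend's host count with the cached count.
    // A mismatch means objects were added or removed (e.g. via the Icinga2 API),
    // so the static object cache for this peer should be rebuilt.
    func needsFullRefresh(cache *peerCache) (bool, error) {
        n, err := backendHostCount()
        if err != nil {
            return false, err
        }
        return n != len(cache.hosts), nil
    }

    func main() {
        cache := &peerCache{hosts: []string{"host1", "host2"}}
        refresh, _ := needsFullRefresh(cache)
        fmt.Println("full refresh needed:", refresh)
    }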

Pass through requests directly to a backend if the backend is online, but its cache is not fresh

We have two backend types that are unfortunately not a hand-in-glove fit for the current implementation of LMD.

One type is our Icinga backends, which appear to be bad at signaling that they have fresh data for the LMD. This means that they will only get refreshed every 5 minutes, even if they have new data to offer every second.

Another type is our huge Nagios backends (hundreds of thousands of checks), which can take 10 minutes or more for LMD to refresh. These Nagios backends do have the advantage over the Icinga backends that, once they have been refreshed, the delta updates work fast.

Unfortunately, we have a very busy environment with a lot of changes coming in. Therefore we would like to reload the backends often, so that configuration updates are reflected as quickly as possible. The Nagios reload itself is not a problem, as it is fast, typically 30s. But the 10m LMD refresh is a challenge, as we would like to reload more often than the 30m interval we feel is possible for now; a reload every 30m leaves only 20m of responsive GUI time with the Nagios backends.

This 10m delay while waiting for LMD to refresh the backend means that we are often left with stale check results in our console, and worse, with no visibility of the acknowledgements and comments that we have added. The check results, acknowledgements and comments will all show up eventually (after 10m), so the LMD functionality is okay, just delayed.

We love the LMD very much, and we would like to investigate options for bringing interactions and responses more "online", even with these two badly behaving backend types that we have.

An idea that we would like to propose is to add an option to pass requests directly through to the backend if the backend is online but its cache is not fresh.

The LMD cache would still refresh in the background as it currently does. Whenever the cache is fresh, LMD would use the cache as its source (putting less CPU load on our Nagios backends). And whenever the LMD cache is not fresh but the backend is up, LMD would use the backend as the source (making our Icinga backends responsive to work with, even if the LMD cache would seldom be used).
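
A small sketch of the proposed routing decision in Go (hypothetical types and threshold, purely to illustrate the idea): serve from the cache while it is fresh, pass the request through to the backend when the cache is stale but the backend is reachable, and fall back to the stale cache otherwise.

    package main

    import (
        "fmt"
        "time"
    )

    // Hypothetical peer state, just for the sketch; not LMD's real data model.
    type peer struct {
        name        string
        online      bool
        lastRefresh time.Time
    }

    // maxCacheAge is the staleness threshold used in this sketch.
    const maxCacheAge = 2 * time.Minute

    // source decides where a query should be answered from: a fresh cache is
    // preferred, a stale cache with a reachable backend falls back to a direct
    // pass-through, and an offline backend leaves only the stale cache.
    func source(p *peer, now time.Time) string {
        fresh := now.Sub(p.lastRefresh) < maxCacheAge
        switch {
        case fresh:
            return "cache"
        case p.online:
            return "pass-through to backend"
        default:
            return "stale cache (backend offline)"
        }
    }

    func main() {
        p := &peer{name: "NAG99", online: true, lastRefresh: time.Now().Add(-10 * time.Minute)}
        fmt.Println(p.name, "->", source(p, time.Now()))
    }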

I was playing around with a sequence diagram; I am attaching it just to show my current understanding of how things currently work with LMD.

We don't know if this kind of change is possible at all, but we would love to hear your thoughts.

(Thruk-LMD-Backends sequence diagram attached)

LMD panic

Hi Sven, I just got my first panic on 1.1.5 in a long time:

[2018-02-26 12:56:05][Error][peer.go:2060] [Naemon] Panic: interface conversion: interface {} is nil, not []interface {}
[2018-02-26 12:56:05][Error][peer.go:2061] [Naemon] goroutine 35 [running]:
runtime/debug.Stack(0xc42015a9c0, 0x8eadab, 0xe)
        /usr/lib/golang/src/runtime/debug/stack.go:24 +0xa7
main.logPanicExitPeer(0xc4200a8c60)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:2061 +0x12f
panic(0x86a580, 0xc44735ac80)
        /usr/lib/golang/src/runtime/panic.go:491 +0x283
main.(*Peer).parseResult.func1(0xc46a831d66, 0x1c6, 0x87c09a, 0x4, 0x783d56, 0x0, 0x0)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:941 +0x186
github.com/buger/jsonparser.ArrayEach(0xc46a0ae010, 0x83e19a, 0xfffdf0, 0xc4657c7510, 0x0, 0x0, 0x0, 0x83e1aa, 0x389, 0x83aa00)
        /home/musso/src/go_workspace/src/github.com/buger/jsonparser/parser.go:898 +0x2e8
main.(*Peer).parseResult(0xc4200a8c60, 0xc4203ba000, 0xc446cbc0c0, 0xc458834000, 0x3153, 0x32aa, 0xb21a80, 0xc443f84188)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:939 +0x70b
main.(*Peer).query(0xc4200a8c60, 0xc4203ba000, 0x0, 0x0, 0x0, 0x0, 0x0)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:916 +0x8a6
main.(*Peer).Query(0xc4200a8c60, 0xc4203ba000, 0x2, 0x11, 0xc4657c7888, 0x2, 0x2)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:1005 +0x39
main.(*Peer).UpdateDeltaTableServices(0xc4200a8c60, 0xc43446bb00, 0x867, 0x0, 0x0)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:635 +0x289
main.(*Peer).UpdateDeltaTableFullScan(0xc4200a8c60, 0xc42018ad00, 0xc448419080, 0x40, 0x1, 0xc448419080, 0x40)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:720 +0xb08
main.(*Peer).UpdateDeltaTableServices(0xc4200a8c60, 0xc448419080, 0x40, 0x0, 0x0)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:623 +0x9f9
main.(*Peer).UpdateDeltaTables(0xc4200a8c60, 0x8e89b9)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:529 +0xb0f
main.(*Peer).periodicUpdate(0xc4200a8c60, 0xc4657c7e7e, 0xc4657c7e80)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:366 +0x370
main.(*Peer).updateLoop(0xc4200a8c60)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:298 +0x17b
main.(*Peer).Start.func1(0xc4200a8c60)
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:208 +0x59
created by main.(*Peer).Start
        /home/musso/src/go_workspace/src/github.com/sni/lmd/lmd/peer.go:205 +0x17f
[2018-02-26 12:56:05][Error][peer.go:2063] [Naemon] LastQuery:
[2018-02-26 12:56:05][Error][peer.go:2064] [Naemon] GET services
ResponseHeader: fixed16
OutputFormat: json
Columns: accept_passive_checks acknowledged acknowledgement_type active_checks_enabled check_options check_type checks_enabled comments current_attempt current_notification_number downtimes event_handler_enabled execution_time first_notification_delay flap_detection_enabled has_been_checked in_check_period in_notification_period is_executing is_flapping last_check last_hard_state last_hard_state_change last_notification last_state last_state_change last_time_critical last_time_warning last_time_ok last_time_unknown latency long_plugin_output low_flap_threshold modified_attributes modified_attributes_list next_check next_notification notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data scheduled_downtime_depth state state_type staleness host_name description
Filter: last_check >= 1519649758
Filter: is_executing = 1
Or: 2
Filter: last_check = 1519649736
Filter: last_check = 1519649760
Filter: last_check = 1519649747
Filter: last_check = 1519649690
Filter: last_check = 1519649738
Filter: last_check = 1519649712
Filter: last_check = 1519649709
Filter: last_check = 1519649664
Filter: last_check = 1519649741
Filter: last_check = 1519649757
Filter: last_check = 1519649734
Filter: last_check = 1519649748
Filter: last_check = 1519649718
Filter: last_check = 1519649737
Filter: last_check = 1519649758
Filter: last_check = 1519649660
Filter: last_check = 1519649739
Filter: last_check = 1519649759
Filter: last_check = 1519649745
Filter: last_check = 1519649725
Filter: last_check = 1519649730
Filter: last_check = 1519649729
Filter: last_check = 1519649754
Filter: last_check = 1519649673
Filter: last_check = 1519649681
Filter: last_check = 1519649716
Filter: last_check = 1519649750
Filter: last_check = 1519649722
Filter: last_check = 1519649733
Filter: last_check = 1519649691
Filter: last_check = 1519649657
Filter: last_check = 1519649710
Filter: last_check = 1519649662
Filter: last_check = 1519649727
Filter: last_check = 1519649756
Filter: last_check = 1519649724
Filter: last_check = 1519649706
Filter: last_check = 1519649682
Filter: last_check = 1519649753
Filter: last_check = 1519649665
Filter: last_check = 1519649676
Filter: last_check = 1519649696
Filter: last_check = 1519649720
Filter: last_check = 1519649695
Filter: last_check = 1519649746
Filter: last_check = 1519649755
Filter: last_check = 1519649732
Filter: last_check = 1519649719
Filter: last_check = 1519649671
Filter: last_check = 1519649717
Filter: last_check = 1519649675
Filter: last_check = 1519649752
Filter: last_check = 1519649714
Filter: last_check = 1519649698
Filter: last_check = 1519649721
Filter: last_check = 1519649692
Filter: last_check = 1519649728
Filter: last_check = 1519649735
Filter: last_check = 1519649674
Filter: last_check = 1519649749
Filter: last_check = 1519649742
Filter: last_check = 1519649726
Filter: last_check = 1519649731
Filter: last_check = 1519649751
Filter: last_check = 1519649743
Or: 66

[2018-02-26 12:56:05][Error][peer.go:2065] [Naemon] LastResponse:

Followed by a huge array of services...

Any chance that this is fixed in 1.2.0?

Bad request: table statehist does not exist

LMD responds with "400 - bad request: table statehist does not exist". Accessing livestatus directly works fine, but not when the query is proxied through lmd.
Or is there another way to get state history using lmd?
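
As a workaround, the statehist table can be queried on the backend's own livestatus endpoint, bypassing lmd for state-history queries. A minimal sketch in Go (the address 127.0.0.1:6557, the columns and the time filter are illustrative):

    package main

    import (
        "fmt"
        "io"
        "net"
    )

    func main() {
        // Connect straight to the backend's own livestatus port, not to lmd.
        conn, err := net.Dial("tcp", "127.0.0.1:6557")
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        // statehist is served by the backend's livestatus module itself;
        // the columns and time filter below are just an example.
        query := "GET statehist\n" +
            "Columns: host_name service_description state duration\n" +
            "Filter: time >= 1609455600\n" +
            "OutputFormat: json\n\n"
        if _, err := conn.Write([]byte(query)); err != nil {
            panic(err)
        }

        // Without KeepAlive the backend closes the connection after the
        // response, so reading until EOF returns the whole result.
        body, err := io.ReadAll(conn)
        if err != nil {
            panic(err)
        }
        fmt.Println(string(body))
    }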

No AuthUser support in lmd?

Hey, as far as I can tell from looking at the code and trying queries, there is no support for AuthUser on queries sent to lmd, right? So all users will see all hosts and services, regardless of permissions?
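
Until AuthUser is supported, a client has to enforce this itself. In plain livestatus, AuthUser roughly means "show only hosts where the given user is a contact", so a client can fetch the contacts column and filter the result. A minimal client-side sketch in Go (hypothetical, not LMD functionality):

    package main

    import "fmt"

    // row mirrors one entry of a hosts query that asked for
    // "Columns: name contacts" with OutputFormat: json.
    type row struct {
        Name     string
        Contacts []string
    }

    // visibleTo keeps only the hosts where user is listed as a contact, which
    // is roughly what livestatus' AuthUser header does for the hosts table.
    func visibleTo(user string, rows []row) []row {
        var out []row
        for _, r := range rows {
            for _, c := range r.Contacts {
                if c == user {
                    out = append(out, r)
                    break
                }
            }
        }
        return out
    }

    func main() {
        rows := []row{
            {Name: "web01", Contacts: []string{"alice", "bob"}},
            {Name: "db01", Contacts: []string{"carol"}},
        }
        fmt.Println(visibleTo("alice", rows)) // only web01 is visible to alice
    }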

Tks.

[main.go:680] got sigint, quitting - lmd reloads backends for no apparent reason, sometimes

After we applied the 0980d57 fix, the frequency of this error has increased in our setup.

When it appears, lmd seems to flush all the backends, and they need to start over from the beginning. This shows up as red backends in Thruk.

[2021-01-19 07:33:03.111][Info][main.go:678] got sigint, quitting
[2021-01-19 09:36:55.921][Info][main.go:678] got sigint, quitting
[2021-01-19 10:20:56.892][Info][main.go:678] got sigint, quitting
[2021-01-19 14:43:40.910][Info][main.go:678] got sigint, quitting
[2021-01-20 03:41:23.789][Info][main.go:678] got sigint, quitting
[2021-01-20 08:50:03.037][Info][main.go:678] got sigint, quitting
[2021-01-20 11:03:09.320][Info][main.go:678] got sigint, quitting
[2021-01-20 12:28:15.770][Info][main.go:678] got sigint, quitting
[2021-01-20 12:55:48.134][Info][main.go:678] got sigint, quitting
[2021-01-20 15:06:04.177][Info][main.go:678] got sigint, quitting
[2021-01-21 00:35:03.777][Info][main.go:678] got sigint, quitting
[2021-01-21 02:01:09.637][Info][main.go:678] got sigint, quitting
[2021-01-21 10:04:05.032][Info][main.go:678] got sigint, quitting
-- lmd upgrade --
[2021-01-21 11:33:11.370][Info][main.go:680] got sigint, quitting
[2021-01-21 12:02:43.109][Info][main.go:680] got sigint, quitting
[2021-01-21 12:13:40.696][Info][main.go:680] got sigint, quitting
[2021-01-21 12:30:43.110][Info][main.go:680] got sigint, quitting
[2021-01-21 12:44:31.617][Info][main.go:680] got sigint, quitting
[2021-01-21 14:06:51.444][Info][main.go:680] got sigint, quitting
[2021-01-21 14:34:52.465][Info][main.go:680] got sigint, quitting
[2021-01-21 14:46:27.096][Info][main.go:680] got sigint, quitting
[2021-01-21 14:57:50.978][Info][main.go:680] got sigint, quitting

The reloads that fall around :01-04 and :31-34 can be correlated with restarts of the backends.

The reloads that fall outside these intervals appear to just happen spontaneously, for instance the latest reloads at 14:46:27.096 and 14:57:50.978.

lmd.log reports a varying number of socket error messages when this happens, and then lmd finally starts over.

# cat  /var/log/lmd.log | grep -B 1 -A 30 "got sigint"
[2021-01-21 02:01:07.637][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 557.912µs, response size: 2.9KiB
[2021-01-21 02:01:09.637][Info][main.go:678] got sigint, quitting
[2021-01-21 02:01:09.637][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 02:01:09.637][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 02:01:09.638][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 02:01:09.638][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 02:01:13.104][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 02:01:13.104][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 02:01:13.105][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 02:01:13.105][Info][peer.go:311] [NCE-4] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [AME-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [NCE-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [NCE-3] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [MED-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [ASP-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [APPL-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [NCE-2] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [MED-3] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [NAG99] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [MED-2] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [APPL-ChinaHUB] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [CHI-1] starting connection
[2021-01-21 02:01:13.105][Info][peer.go:311] [Master] starting connection
[2021-01-21 02:01:13.487][Info][peer.go:807] [APPL-1] objects created in: 381.724743ms
[2021-01-21 02:01:13.588][Info][peer.go:807] [NAG99] objects created in: 482.185439ms
[2021-01-21 02:01:14.722][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 1.609623ms, response size: 736 B
[2021-01-21 02:01:14.723][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 302.309µs, response size: 2.0KiB
[2021-01-21 02:01:14.942][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 1.197946ms, response size: 90 B
[2021-01-21 02:01:14.950][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 2.820504ms, response size: 90 B
[2021-01-21 02:01:14.972][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 91.717µs, response size: 90 B
[2021-01-21 02:01:14.978][Info][listener.go:165] incoming hosts request from @ to /var/cache/thruk/lmd/live.sock finished in 128.58µs, response size: 72 B
[2021-01-21 02:01:14.992][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 4.793161ms, response size: 80 B
--
[2021-01-21 10:04:04.937][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 2.56009845s, response size: 69.8MiB
[2021-01-21 10:04:05.032][Info][main.go:678] got sigint, quitting
[2021-01-21 10:04:05.033][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:05.033][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:05.033][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 10:04:05.033][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 10:04:05.626][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 10:04:11.008][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:11.009][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:11.009][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:11.009][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 10:04:11.009][Info][peer.go:311] [AME-1] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [NCE-3] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [MED-1] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [NCE-4] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [MED-3] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [CHI-1] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [NCE-2] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [MED-2] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [NAG99] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [APPL-1] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [APPL-ChinaHUB] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [Master] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [NCE-1] starting connection
[2021-01-21 10:04:11.010][Info][peer.go:311] [ASP-1] starting connection
[2021-01-21 10:04:11.097][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 265.294µs, response size: 80 B
[2021-01-21 10:04:11.422][Info][peer.go:807] [APPL-1] objects created in: 411.30088ms
[2021-01-21 10:04:11.505][Info][peer.go:807] [NAG99] objects created in: 494.419291ms
[2021-01-21 10:04:11.596][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 128.071µs, response size: 80 B
[2021-01-21 10:04:11.981][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 234.973µs, response size: 540 B
[2021-01-21 10:04:11.983][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 336.137µs, response size: 2.0KiB
[2021-01-21 10:04:12.094][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 1.99135ms, response size: 3 B
--
[2021-01-21 11:33:11.299][Info][listener.go:165] incoming hosts request from @ to /var/cache/thruk/lmd/live.sock finished in 51.212563ms, response size: 72 B
[2021-01-21 11:33:11.370][Info][main.go:680] got sigint, quitting
[2021-01-21 11:33:11.370][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:11.371][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 11:33:11.371][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:11.371][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 11:33:11.780][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:11.780][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 11:33:16.858][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:16.858][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:16.859][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:16.859][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 11:33:21.890][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:21.890][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 11:33:22.716][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:22.716][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:22.716][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:22.717][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 11:33:22.717][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 11:33:22.717][Info][peer.go:309] [Master] starting connection
--
[2021-01-21 12:02:40.940][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 393.819µs, response size: 4.5KiB
[2021-01-21 12:02:43.109][Info][main.go:680] got sigint, quitting
[2021-01-21 12:02:43.109][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:43.109][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:43.109][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:02:43.109][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:02:43.496][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:43.496][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:02:48.848][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:48.849][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:48.849][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:48.849][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:02:48.849][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [Master] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 12:02:48.849][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 12:02:49.055][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 292.995µs, response size: 574 B
[2021-01-21 12:02:49.211][Info][peer.go:833] [APPL-1] objects created in: 361.504581ms
[2021-01-21 12:02:49.357][Info][peer.go:833] [NAG99] objects created in: 507.294699ms
[2021-01-21 12:02:49.430][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 146.867µs, response size: 80 B
[2021-01-21 12:02:49.688][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 156.841µs, response size: 736 B
[2021-01-21 12:02:49.690][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 388.653µs, response size: 2.0KiB
--
[2021-01-21 12:13:38.913][Warn][datastoreset.go:431] [Master] services delta scan timestamp filter too complex: 302
[2021-01-21 12:13:40.696][Info][main.go:680] got sigint, quitting
[2021-01-21 12:13:40.696][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:40.696][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:40.696][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:13:40.697][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:13:41.075][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:41.075][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:13:46.181][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:46.181][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:46.181][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:46.182][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:13:46.182][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [Master] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 12:13:46.182][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 12:13:46.188][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 2.413651ms, response size: 574 B
[2021-01-21 12:13:46.563][Info][peer.go:833] [APPL-1] objects created in: 380.475508ms
[2021-01-21 12:13:46.649][Info][peer.go:833] [NAG99] objects created in: 466.348782ms
[2021-01-21 12:13:47.099][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 56.770704ms, response size: 12.2KiB
[2021-01-21 12:13:47.120][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 5.507899ms, response size: 422 B
[2021-01-21 12:13:48.109][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 324.501µs, response size: 736 B
--
[2021-01-21 12:30:43.059][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 582.821125ms, response size: 5.4KiB
[2021-01-21 12:30:43.110][Info][main.go:680] got sigint, quitting
[2021-01-21 12:30:43.110][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:43.111][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:30:43.111][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:43.111][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:30:43.569][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:43.569][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:30:48.813][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:48.813][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:48.813][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:48.813][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:30:51.989][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:51.990][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:51.990][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:51.990][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:30:57.160][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:57.161][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:57.161][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:57.161][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:30:59.058][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.058][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.058][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.058][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:30:59.418][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.418][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.418][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.418][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:30:59.418][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 12:30:59.419][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 12:30:59.419][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 12:30:59.419][Info][peer.go:309] [Master] starting connection
--
[2021-01-21 12:44:30.524][Info][listener.go:165] incoming hostgroups request from @ to /var/cache/thruk/lmd/live.sock finished in 1.173094ms, response size: 777 B
[2021-01-21 12:44:31.617][Info][main.go:680] got sigint, quitting
[2021-01-21 12:44:31.617][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:31.617][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:31.617][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:44:31.617][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 12:44:31.901][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:31.901][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:37.175][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:37.175][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:37.176][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:37.176][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:41.280][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.280][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.280][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.280][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:41.348][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.348][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.349][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.349][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:41.648][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.649][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.649][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:41.649][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:42.046][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:42.047][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:42.047][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:42.047][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 12:44:47.295][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:47.295][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:47.296][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 12:44:47.296][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
--
[2021-01-21 14:06:51.286][Info][listener.go:165] incoming hostgroups request from @ to /var/cache/thruk/lmd/live.sock finished in 79.41176ms, response size: 777 B
[2021-01-21 14:06:51.444][Info][main.go:680] got sigint, quitting
[2021-01-21 14:06:51.444][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:06:51.444][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:06:51.444][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:06:51.444][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:06:52.007][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:06:52.008][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:06:52.008][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:06:52.008][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [Master] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 14:06:52.008][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 14:06:52.411][Info][peer.go:833] [APPL-1] objects created in: 402.392444ms
[2021-01-21 14:06:52.484][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 397.113µs, response size: 654 B
[2021-01-21 14:06:52.486][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 422.794µs, response size: 2.0KiB
[2021-01-21 14:06:52.490][Info][peer.go:833] [NAG99] objects created in: 481.988963ms
[2021-01-21 14:06:53.209][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 160.116µs, response size: 80 B
[2021-01-21 14:06:53.718][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 73.337µs, response size: 80 B
[2021-01-21 14:06:53.949][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 289.36µs, response size: 90 B
[2021-01-21 14:06:53.954][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 117.435µs, response size: 422 B
[2021-01-21 14:06:54.112][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 87.819µs, response size: 80 B
--
[2021-01-21 14:34:51.223][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 246.162114ms, response size: 93 B
[2021-01-21 14:34:52.465][Info][main.go:680] got sigint, quitting
[2021-01-21 14:34:52.465][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:52.465][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:52.465][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:34:52.465][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:34:53.134][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:53.136][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:34:58.412][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:58.413][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:58.413][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:58.413][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:34:58.413][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [Master] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 14:34:58.413][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 14:34:58.750][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 369.203µs, response size: 574 B
[2021-01-21 14:34:58.821][Info][peer.go:833] [APPL-1] objects created in: 408.118364ms
[2021-01-21 14:34:58.915][Info][peer.go:833] [NAG99] objects created in: 501.253344ms
[2021-01-21 14:34:58.975][Info][listener.go:165] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 214.266µs, response size: 659 B
[2021-01-21 14:34:58.976][Info][listener.go:165] incoming sites request from @ to /var/cache/thruk/lmd/live.sock finished in 286.74µs, response size: 2.0KiB
[2021-01-21 14:34:59.062][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 58.840885ms, response size: 12.2KiB
--
[2021-01-21 14:46:27.035][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 374.091229ms, response size: 97 B
[2021-01-21 14:46:27.096][Info][main.go:680] got sigint, quitting
[2021-01-21 14:46:27.096][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:27.096][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:27.096][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:46:27.096][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:46:27.429][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:27.429][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:46:32.449][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:32.449][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:32.449][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:32.449][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:46:35.818][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.818][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.819][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.819][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:46:35.948][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.948][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.948][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.948][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:46:35.949][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 14:46:35.949][Info][peer.go:309] [Master] starting connection
--
[2021-01-21 14:57:50.877][Info][listener.go:165] incoming contactgroups request from @ to /var/cache/thruk/lmd/live.sock finished in 56.707555ms, response size: 528.1KiB
[2021-01-21 14:57:50.978][Info][main.go:680] got sigint, quitting
[2021-01-21 14:57:50.978][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:50.978][Info][listener.go:311] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:50.978][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:57:50.978][Info][listener.go:324] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2021-01-21 14:57:51.091][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:51.091][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:57:56.189][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:56.189][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:56.190][Fatal][listener.go:297] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use
[2021-01-21 14:57:59.456][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:59.456][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:59.456][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:59.457][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-21 14:57:59.457][Info][peer.go:309] [ASP-1] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [NAG99] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [MED-1] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [MED-2] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [AME-1] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [MED-3] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [NCE-2] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [NCE-4] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [Master] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [NCE-1] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [CHI-1] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [NCE-3] starting connection
[2021-01-21 14:57:59.457][Info][peer.go:309] [APPL-1] starting connection
[2021-01-21 14:57:59.881][Info][peer.go:833] [APPL-1] objects created in: 423.579211ms
[2021-01-21 14:57:59.978][Info][peer.go:833] [NAG99] objects created in: 520.227145ms
[2021-01-21 14:58:00.551][Info][listener.go:165] incoming comments request from @ to /var/cache/thruk/lmd/live.sock finished in 564.769µs, response size: 3 B

Invalid request 'Filter: is_executing = 1'

I installed Thruk & LMD a couple years ago and it has been working very well. Recently I upgraded Thruk to use the new API support and that went without a hitch. I encountered some problems when upgrading LMD.

For background, all servers run RHEL 7 or CentOS 7 with recent patches. The Thruk server connects to multiple Nagios servers at remote sites using stunnel (I followed https://www.thruk.org/documentation/install.html#_tls-livestatus). All Nagios instances are 4.4.6 and use check-mk-livestatus-1.4.0p31 (RPMs from the EPEL repo).

Before upgrading Thruk & LMD:

  • Thruk 2.20
  • LMD 1.3.0 (extracted from omd 2.7.0)
  • No errors in /var/cache/thruk/lmd/lmd.log on the Thruk server or in /var/log/nagios/livestatus.log on Nagios servers.

After upgrading Thruk & before upgrading LMD:

  • Thruk 2.46
  • LMD 1.3.0 (extracted from omd 2.7.0)
  • /var/cache/thruk/lmd/lmd.log on the Thruk server started showing this error repeatedly:
[2021-11-12 09:13:21][Warn][filter.go:609] not implemented op: 7

I thought the new version of Thruk might require a newer version of LMD; that's part of what led me to attempt upgrading LMD.

After upgrading LMD:

  • LMD 2.0.3 (extracted from omd 4.40)
  • No errors in /var/cache/thruk/lmd/lmd.log on the Thruk server.
  • /var/log/nagios/livestatus.log on the Nagios servers started showing these errors repeatedly:
2021-11-12 14:49:18 [client 5] Invalid request 'Filter: is_executing = 1'
2021-11-12 14:49:18 [client 5] error: Invalid request method

So it seems that upgrading LMD resolved the not implemented op: 7 problem, but introduced a new Filter: is_executing problem. I captured some livestatus queries that Thruk sends to Nagios and they run fine when executing them manually. I'm stumped as to why they all succeed when run by hand but sometimes fail when run by Thruk. The stunnel connections are fine (they use the exact same settings from the Thruk documentation).
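
For reference, here is a minimal Go sketch of how such a captured query can be replayed against a livestatus endpoint by hand. The address, the column list and the surrounding headers are assumptions; only the Filter: is_executing = 1 line is taken from livestatus.log above.

package main

import (
	"fmt"
	"io"
	"net"
)

func main() {
	// Assumed endpoint: the local stunnel client port from the Thruk TLS setup.
	conn, err := net.Dial("tcp", "127.0.0.1:6557")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Illustrative query; the Filter line is the one rejected in livestatus.log.
	query := "GET services\n" +
		"Columns: host_name description state\n" +
		"Filter: is_executing = 1\n" +
		"OutputFormat: json\n" +
		"ResponseHeader: fixed16\n\n"
	if _, err := io.WriteString(conn, query); err != nil {
		panic(err)
	}

	// Without "KeepAlive: on" livestatus answers once and closes the
	// connection, so reading to EOF returns the fixed16 header and the body.
	out, err := io.ReadAll(conn)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}

If a replay like this succeeds while the same query from LMD fails, comparing a network capture of both requests byte by byte might show whether something mangles the request on the LMD/stunnel side.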

As an exercise in trial & error, I tried all versions of LMD that shipped with major OMD releases between 2.7.0 and 4.40. It's pretty clear that the errors are tied to the versions, since they come & go cleanly when the software is upgraded or downgraded.

Thruk Version | LMD Version | OMD Release | lmd.log   | livestatus.log
2.2.0         | 1.3.0       | 2.7.0       | No errors | No errors
2.46          | 1.8.2       | 3.30        | No errors | No errors
2.46          | 1.9.2       | 3.40        | Broken    | Broken
2.46          | 2.0.1       | 4.20        | No errors | Invalid request method
2.46          | 2.0.3       | 4.40        | No errors | Invalid request method

Note that LMD 1.9.2 didn't work at all: the stunnel connections showed peer is down: tls: server selected unsupported protocol version 301 errors. Upgrading or downgrading fixed that problem; only 1.9.2 was affected, for whatever reason.

A guess is that the "Invalid request method" problem noted in livestatus.log started with LMD 2.x since that was a major version bump.

I looked into upgrading the check-mk-livestatus Nagios module but it appears that isn't distributed independently anymore (only available as source code or bundled with the main checkmk release). I also tried replacing check-mk-livestatus with naemon-livestatus, but Nagios wouldn't load the module (nagios[117094]: Error: Module '/usr/lib64/naemon/naemon-livestatus/livestatus.so' is using an old or unspecified version of the event broker API. Module will be unloaded.). While I could probably replace the whole Nagios app with Naemon, that would be a lot of work which I'd rather not tackle at this time.

Do you know what's going on or what other steps can be taken to gather more info? I'd like to get LMD 2.0.3 working with my Nagios 4.4.6 instances.

Thanks.

Timezone in logs

I'm feeding the lmd logs to an external log parser, and having the timezone of the log entry would be a welcome addition.

On our systems, we use lmd embedded in Thruk. The server timezone is UTC, while Thruk itself has a specific timezone set in its configuration file.
This has the resulting effect that:

  • lmd log files have a time expressed in local time (the one configured in Thruk configuration file)
  • while Thruk logs have a time expressed in UTC

Example of log entries differing only by a few seconds, local time is UTC+2:

[2021/04/28 14:38:02][thruk-dev][ERROR][thruk.log] FAILED - to load command module: logcache.
[2021-04-28 16:38:51.202][Info][listener.go:152] incoming status request from @ to /var/cache/thruk/lmd/live.sock finished in 99.844µs, response size: 196 B

(you can see the two-hour (and a few seconds) difference between the two lines, explained by the fact that one time is expressed in UTC and the other in UTC+2)

As there is no timezone specifier in lmd log files (nor in Thruk log files but it's another issue), it's not always easy to tell the external log parser what to do.

This issue is about adding:

  • either adding, in a systematic way, the timezone to the logs (specifically, the timezone offset expressed like "-0700" without any separating colons) by patching logging.go here (which may introduce compatibility issues for users already parsing the date/time); see the sketch after this list
  • or having additional configuration option to specify the format for date/time in the logs
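
As an illustration of the first option (the sketch referenced above): in Go, the reference-time layout for a numeric offset like "-0700" would look as follows. This only shows the layout itself; LMD's actual logging.go and its logging library may wire the format in differently.

package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical timestamp layout including the numeric timezone offset
	// ("-0700" style, no separating colon), as described in the first option.
	const layout = "2006-01-02 15:04:05.000 -0700"
	fmt.Println(time.Now().Format(layout))
	// prints e.g. "2021-04-28 16:38:51.202 +0200" on a UTC+2 system
}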

If this issue is accepted, and if you would like me to provide a PR, please give guidance on which solution should be implemented, and whether there are conventions for naming the configuration parameters, etc.

Thanks for your feedback.

Disabled Active Checks icon not shown for Icinga

Icinga 2.8.1
Thruk 2.18
LMD 1.2

The icon indicating that active checks are disabled is not shown for services. For the host, the icon is shown and says the host is passively checked.

/thruk/#cgi-bin/status.cgi

image

The correct status is shown on the service page.

/thruk/#cgi-bin/extinfo.cgi?
image

LMD Panic interface {} is bool, not float64

Hi,

I installed Thruk, Icinga2 and LMD on a fresh Ubuntu 18.04 and get this error on every request:

[2018-07-27 19:37:26][Error][peer.go:2462] [Icinga2] LastResponse:
[2018-07-27 19:37:26][Error][peer.go:2463] [Icinga2] [[1532713039.0,0.0,false,1.0,1.0], [1532713033.0,0.0,false,1.0,1.0], [1532713032.0,0.0,false,1.0,1.0], [1532712760.0,0.0,false,1.0,1.0], [1532713006.0,0.0,false,1.0,1.0], [1532712949.0,0.0,false,1.0,1.0], [1532712979.0,0.0,false,1.0,1.0], [1532713034.0,0.0,false,1.0,1.0], [1532712980.0,0.0,false,1.0,1.0], [1532713003.0,0.0,false,1.0,1.0], [1532713018.0,0.0,false,1.0,1.0], [1532712986.0,0.0,false,1.0,1.0], [1532712981.0,0.0,false,1.0,1.0], [1532713033.0,0.0,false,1.0,1.0], [1532713023.0,0.0,false,1.0,1.0], [1532713037.0,0.0,false,1.0,1.0], [1532713018.0,0.0,false,1.0,1.0], [1532713039.0,0.0,false,1.0,1.0], [1532712986.0,0.0,false,1.0,1.0], [1532713030.0,0.0,false,1.0,1.0], [1532712986.0,0.0,false,1.0,1.0], [1532712944.0,0.0,false,1.0,1.0], [1532712948.0,0.0,false,1.0,1.0], [1532713042.0,0.0,false,1.0,1.0], [1532712996.0,0.0,false,1.0,1.0], [1532712969.0,0.0,false,1.0,1.0], [1532712951.0,0.0,false,1.0,0.0], [1532713003.0,0.0,false,1.0,1.0], [1532713042.0,0.0,false,1.0,1.0]]
[2018-07-27 19:37:58][Warn][listener.go:218] removing stale socket: /var/cache/thruk/lmd/live.sock
[2018-07-27 19:37:58][Warn][listener.go:218] removing stale socket: /var/cache/thruk/lmd/live.sock
[2018-07-27 19:38:03][Error][peer.go:2457] [Icinga2] Panic: interface conversion: interface {} is bool, not float64
[2018-07-27 19:38:03][Error][peer.go:2458] [Icinga2] goroutine 36 [running]:
runtime/debug.Stack(0xc42019e090, 0x90f0eb, 0xe)
	/usr/lib/go-1.10/src/runtime/debug/stack.go:24 +0xa7
main.logPanicExitPeer(0xc4202260c0)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:2458 +0x12f
panic(0x8861a0, 0xc420450f00)
	/usr/lib/go-1.10/src/runtime/panic.go:502 +0x229
main.(*Peer).getMissingTimestamps(0xc4202260c0, 0xc42013cc30, 0xc420320780, 0xc42005fa78, 0xc420572e10, 0x5, 0x5, 0xc420226000, 0x5b5b5878, 0x7600000001)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:894 +0x67e
main.(*Peer).UpdateDeltaTableFullScan(0xc4202260c0, 0xc42013cc30, 0xc4204a9dc0, 0x40, 0x1, 0xc4204a9dc0, 0x40)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:848 +0x417
main.(*Peer).UpdateDeltaTableHosts(0xc4202260c0, 0xc4204a9dc0, 0x40, 0x0, 0x0)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:712 +0x86d
main.(*Peer).UpdateDeltaTables(0xc4202260c0, 0x90cbd7)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:665 +0xac4
main.(*Peer).periodicUpdate(0xc4202260c0, 0xc42005fe76, 0xc42005fe78)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:394 +0x370
main.(*Peer).updateLoop(0xc4202260c0)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:325 +0x2da
main.(*Peer).Start.func1(0xc4202260c0)
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:226 +0x59
created by main.(*Peer).Start
	/opt/local/go/src/github.com/sni/lmd/lmd/peer.go:223 +0x17f
[2018-07-27 19:38:03][Error][peer.go:2460] [Icinga2] LastQuery:
[2018-07-27 19:38:03][Error][peer.go:2461] [Icinga2] GET hosts
ResponseHeader: fixed16
OutputFormat: json
Columns: last_check scheduled_downtime_depth acknowledged active_checks_enabled notifications_enabled
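
The LastResponse rows above mix JSON numbers with booleans (Icinga 2 answers false/true for flag columns such as acknowledged), while the panicking type assertion expects float64 everywhere. Below is a hedged sketch of a tolerant conversion, purely to illustrate the failing assertion; this is not LMD's actual code or fix.

package main

import "fmt"

// interfaceToFloat64 is a hypothetical helper: it accepts values that may
// arrive either as float64 (plain JSON numbers) or as bool (Icinga 2 flag
// columns) instead of asserting float64 unconditionally.
func interfaceToFloat64(v interface{}) float64 {
	switch val := v.(type) {
	case float64:
		return val
	case bool:
		if val {
			return 1
		}
		return 0
	default:
		return 0
	}
}

func main() {
	// One row from the LastResponse above, columns as in the LastQuery:
	// last_check scheduled_downtime_depth acknowledged active_checks_enabled notifications_enabled
	row := []interface{}{1532712951.0, 0.0, false, 1.0, 0.0}
	for _, v := range row {
		fmt.Printf("%v ", interfaceToFloat64(v))
	}
	fmt.Println()
}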

Thruk 2.22
Icinga r2.9.0-1
LMD 1.3.3

/var/cache/thruk/lmd/lmd.ini:

Listen = ['/var/cache/thruk/lmd/live.sock']

LogFile = '/var/cache/thruk/lmd/lmd.log'

LogLevel = 'Warn'

[[Connections]]
name    = 'Icinga2'
id      = '0ac2d'
source  = ['/var/run/icinga2/cmd/livestatus']

/etc/thruk/thruk_local.d/lmd.conf:

use_lmd_core=1
lmd_core_bin=/opt/local/go/bin/lmd
lmd_core_config = /var/cache/thruk/lmd/lmd.ini

/etc/thruk/thruk_local.conf:

<Component Thruk::Backend>
    <peer>
        name    = Icinga2
        id      = 0ac2d
        type    = livestatus
        <options>
            peer          = /var/run/icinga2/cmd/livestatus
        </options>
    </peer>
</Component>

LMD not initializing...

Hi Sven,

We have multiple naemon instances running and I am trying to configure lmd in place of Shadownaemon to fetch host/service status from remote naemon instances via livestatus into thruk. After apache was restarted, the thruk log shows "lmd not running, starting up...".

Note: I followed the steps suggested in the lmd readme.

  • On apache restart, the lmd.ini file is created successfully with the correct connection details:
Listen = ['/var/cache/thruk/lmd/live.sock']
LogFile = '/var/cache/thruk/lmd/lmd.log'
LogLevel = 'Warn'
[[Connections]]
name   = 'Managed'
id     = '0324d'
source = ['1.2.3.4:6557']
  • Errors in Thruk Log:
    [ERROR][Thruk] Managed:  (10.105.75.21:6557)
    [ERROR][Thruk] lmd not running, starting up...
    [ERROR][Thruk] No Backend available
  • Creation of live.sock and lmd.log didn't happen, and the following is the output of thruk -a listhosts:
# thruk -a listhosts
[15:09:36,591][ERROR][Thruk] lmd not running, starting up...
[15:09:37,595][ERROR][Thruk] lmd not running, starting up...
failed to open socket /var/cache/thruk/lmd/live.sock: No such file or directory at /usr/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 378.
 at /usr/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 378.
  • File permissions /var/cache/thruk/lmd
      drwxrwx--- 2 apache apache 4096 Jul 27 12:17 lmd

Steps followed in configuring lmd:

  • Installed thruk on RHEL7 box
  • Installed golang and then installed lmd
  • Added the following in /etc/thruk/thruk_local.d:
    use_lmd_core=1
    lmd_core_bin=/opt/local/go/bin/lmd
  • Restarted apache.

Environment Details:

OS:  Red Hat Enterprise Linux Server release 7.1 (Maipo)
Naemon Version:  1.0.6-1
Thruk Version:  2.14~2

LMD takes too long to create cache

Hey,

I have a naemon 1.0.8 + livestatus running with 1500 hosts and 32k services. Livestatus is fast, the interface via livestatus is very fast, but LMD takes 8m to read all objects on startup.
Both the interface and LMD are using livestatus via xinetd.
Any tips on how I could debug this or improve its performance?
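
One way to narrow this down would be to time a raw bulk fetch of the services table over the same xinetd socket and compare it with the object-creation time LMD logs at startup. A minimal Go sketch, where host, port and the choice of table are assumptions:

package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

func main() {
	start := time.Now()

	// Assumed xinetd livestatus endpoint; adjust to the real host:port.
	conn, err := net.Dial("tcp", "127.0.0.1:6557")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Fetch the whole services table, roughly what LMD does on its initial sync.
	query := "GET services\nOutputFormat: json\nColumnHeaders: on\n\n"
	if _, err := io.WriteString(conn, query); err != nil {
		panic(err)
	}

	// Livestatus closes the connection after answering, so count bytes to EOF.
	n, err := io.Copy(io.Discard, conn)
	if err != nil {
		panic(err)
	}
	fmt.Printf("fetched %d bytes in %s\n", n, time.Since(start))
}

If this raw fetch is also slow, the bottleneck is likely on the livestatus/xinetd side; if it is fast, the time is probably being spent inside LMD after the data arrives.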

Tks.

Thruk/LMD: The LMD process sometimes disappears without a trace

Having the #108 issue explained (thanks! :-)), next up for us is another similar issue, where the LMD process appears to just vanish without a trace.

It is almost like the LMD process is being hard killed and then restarted by Thruk - except, no trace of this happening is left in the thruk.log file.

The pattern in the log file goes like this:

  • regular, ordinary messages
  • suddenly: "[Debug][main.go:880] command line arguments:"

That last message is the ordinary LMD startup message.

The startup message comes just a split second after the latest entry of the ordinary activity, so the LMD process appears to be restarted blazingly fast.

Any ideas what might be the cause of this pattern?

[root@dkrdswppthrukp01 ~]# ps -ef | grep lmd
apache   32493     1 99 02:38 ?        05:04:57 /usr/local/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/thruk_local.d/lmd.ini
root     22203  4843  0 04:40 pts/2    00:00:00 grep lmd
[root@dkrdswppthrukp01 ~]# 

# cat /var/log/lmd.log | egrep '(02\:38\:|02\:37\:|02\:36\:)' | grep -v "request.go:313"
[2021-01-22 02:38:48.236][Debug][listener.go:75] incoming request from: @ to /var/cache/thruk/lmd/live.sock
[2021-01-22 02:38:48.236][Debug][request.go:291] request: GET services
[2021-01-22 02:38:48.338][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 98.803725ms, response size: 90 B
[2021-01-22 02:38:48.342][Debug][listener.go:75] incoming request from: @ to /var/cache/thruk/lmd/live.sock
[2021-01-22 02:38:48.342][Debug][request.go:291] request: GET hosts
[2021-01-22 02:38:48.791][Debug][main.go:880] command line arguments:
[2021-01-22 02:38:48.792][Debug][main.go:882] args: [
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "/usr/local/bin/lmd",
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "-pidfile",
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "/var/cache/thruk/lmd/pid",
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "-config",
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "/var/cache/thruk/lmd/lmd.ini",
[2021-01-22 02:38:48.792][Debug][main.go:882] args:   "-config",
...
[2021-01-22 02:38:48.795][Debug][main.go:889] conf:   "CompressionMinimumSize": 500,
[2021-01-22 02:38:48.795][Debug][main.go:889] conf:   "CompressionLevel": -1,
[2021-01-22 02:38:48.795][Debug][main.go:889] conf:   "MaxClockDelta": 0,
[2021-01-22 02:38:48.795][Debug][main.go:889] conf:   "UpdateOffset": 3
[2021-01-22 02:38:48.795][Debug][main.go:889] conf: }
[2021-01-22 02:38:48.796][Warn][listener.go:275] removing stale socket: /var/cache/thruk/lmd/live.sock
[2021-01-22 02:38:48.797][Info][listener.go:305] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2021-01-22 02:38:48.797][Info][listener.go:305] listening for incoming queries on tcp :3333
[2021-01-22 02:38:48.797][Info][peer.go:309] [NCE-3] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [MED-2] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [APPL-1] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [APPL-ChinaHUB] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [Master] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [NCE-1] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [MED-1] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [ASP-1] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [MED-3] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [NAG99] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [NCE-2] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [NCE-4] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [CHI-1] starting connection
[2021-01-22 02:38:48.797][Info][peer.go:309] [AME-1] starting connection
[2021-01-22 02:38:48.799][Debug][datastoreset.go:93] [Master] updated table:          status - fetch:       0s - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 02:38:48.800][Debug][datastoreset.go:93] [AME-1] updated table:          status - fetch:      1ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 02:38:48.800][Debug][peer.go:1059] [Master] LastQuery:
[2021-01-22 02:38:48.800][Debug][peer.go:1060] [Master] GET columns
[2021-01-22 02:38:48.800][Debug][peer.go:907] [Master] sending data/query failed: [Master] bad response code: 404 - Table 'columns' does not exist.
[2021-01-22 02:38:48.800][Debug][datastoreset.go:93] [NCE-3] updated table:          status - fetch:      1ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 02:38:48.800][Debug][datastoreset.go:93] [NCE-1] updated table:          status - fetch:      2ms - insert:       0s - count:        1 - size:        0 kB


[root@dkrdswppthrukp01 ~]# ps -ef | grep lmd
apache    8139     1 99 09:04 ?        00:28:17 /usr/local/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/thruk_local.d/lmd.ini
root     10055  4843  0 09:12 pts/2    00:00:00 grep lmd
[root@dkrdswppthrukp01 ~]#

[2021-01-22 09:04:35.131][Debug][datastoreset.go:300] [ASP-1] updated table:           hosts - fetch:    111ms - insert:       0s - count:       12 - size:        6 kB
[2021-01-22 09:04:35.143][Debug][datastoreset.go:300] [ASP-1] updated table:        services - fetch:     10ms - insert:       0s - count:       15 - size:        5 kB
[2021-01-22 09:04:35.151][Debug][datastoreset.go:199] [ASP-1] delta update complete in: 134.949993ms
[2021-01-22 09:04:35.244][Debug][datastoreset.go:300] [NCE-4] updated table:           hosts - fetch:     32ms - insert:    276ms - count:        1 - size:        0 kB
[2021-01-22 09:04:35.795][Debug][main.go:880] command line arguments:

[root@dkrdswppthrukp01 ~]# ps -ef | grep lmd
apache   22269     1 99 10:06 ?        11:09:18 /usr/local/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/thruk_local.d/lmd.ini
root     15810  4843  0 12:06 pts/2    00:00:00 grep lmd
[root@dkrdswppthrukp01 ~]#

[2021-01-22 10:06:57.340][Debug][request.go:313] request: ResponseHeader: fixed16
[2021-01-22 10:06:57.415][Debug][rawresultset.go:39] sorting result took 52.417µs
[2021-01-22 10:06:57.418][Info][listener.go:165] incoming services request from @ to /var/cache/thruk/lmd/live.sock finished in 77.826943ms, response size: 281.2KiB
[2021-01-22 10:06:57.493][Debug][datastoreset.go:708] [AME-1] updated table:          status - fetch:      1ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 10:06:58.207][Debug][main.go:880] command line arguments:


[root@dkrdswppthrukp01 ~]# ps -ef | grep lmd
apache   16223     1 99 12:08 ?        00:01:00 /usr/local/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/thruk_local.d/lmd.ini
root     16635  4843  0 12:09 pts/2    00:00:00 grep lmd
[root@dkrdswppthrukp01 ~]# 

[2021-01-22 12:08:51.981][Debug][datastoreset.go:300] [NCE-4] updated table:        services - fetch:     25ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 12:08:51.984][Debug][datastoreset.go:300] [NCE-4] updated table:        services - fetch:     33ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 12:08:51.984][Debug][datastoreset.go:300] [NCE-4] updated table:        services - fetch:     23ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 12:08:51.986][Debug][datastoreset.go:300] [NCE-4] updated table:        services - fetch:     35ms - insert:       0s - count:        1 - size:        0 kB
[2021-01-22 12:08:52.294][Debug][main.go:880] command line arguments:

[root@dkrdswppthrukp01 ~]# ps -ef | grep lmd
apache   17230     1 99 12:11 ?        00:23:47 /usr/local/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/thruk_local.d/lmd.ini
root     18440  4843  0 12:18 pts/2    00:00:00 grep lmd
[root@dkrdswppthrukp01 ~]# 

[2021-01-22 12:11:29.607][Debug][request.go:313] request: Columns: can_submit_commands alias email peer_key
[2021-01-22 12:11:29.607][Debug][request.go:313] request: Filter: name = yaylu
[2021-01-22 12:11:29.607][Debug][request.go:313] request: OutputFormat: json
[2021-01-22 12:11:29.607][Debug][request.go:313] request: ResponseHeader: fixed16
[2021-01-22 12:11:29.929][Debug][main.go:880] command line arguments:
[root@dkrdswppthrukp01 ~]#

Looking at this from the Thruk side of things, it doesn't look like Thruk is the cause of this.

The log doesn't mention that Thruk is killing the LMD process, or even restarting it.

[2021/01/22 12:05:54][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:06:32][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:07:20][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:07:32][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:08:32][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:08:53][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] internal lmd error - did not get a valid response for at least any site at /usr/share/thruk/lib/Thruk/Backend/Manager.pm line 1880.
[2021/01/22 12:19:25][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0

The Thruk log in more detail.

From losing connection to LMD, to the LMD having been restarted, to all backends up again.

# tail -20000 /var/log/thruk/thruk.log | egrep -v '^\||^\+' 
[2021/01/22 12:08:31][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'Leanne Mae Dayao','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:05:00','expir...
[2021/01/22 12:08:31][dkrdswppthrukp01.vestasext.net][WARN] user:    laeda
[2021/01/22 12:08:31][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.226.231
[2021/01/22 12:08:31][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:32][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 102.1s to load.
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'John Carl Marzan','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:06:00','expir...
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] user:    jalmr
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.224.109
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 75.3s to load.
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'Leanne Mae Dayao','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:07:00','expir...
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] user:    laeda
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.226.231
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 120.8s to load.
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'Leanne Mae Dayao','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:05:00','expir...
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] user:    laeda
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.226.231
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?hostgroup=all&style=summary&_=1611313729292
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313729292','hostgroup' => 'all','style' => 'summary'};
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] user:    jfgda
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.61.22
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] CHI-1: peer is down: connecting... (172.26.66.63:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NAG99: peer is down: connecting... (172.26.66.12:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] APPL-1: peer is down: connecting... (10.0.53.142:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] APPL-ChinaHUB: peer is down: connecting... (10.78.8.134:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] did not get a valid response for at least any site at /usr/share/thruk/lib/Thruk/Backend/Manager.pm line 1880.
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?backends=51656&backends=4120b&backends=a103e&backends=53dec&backends=d4c78&backends=bc29d&backends=51d28&backends=bde22&backends=90ba2&backends=5f10e&backends=a690a&backends=4120b&backends=a103e&backends=53dec&backends=d4c78&backends=bc29d&backends=51d28&backends=bde22&backends=90ba2&backends=5f10e&backends=a690a&dfl_s0_hostprops=262186&dfl_s0_hoststatustypes=2&dfl_s0_op==&dfl_s0_op=~&dfl_s0_op=!=&dfl_s0_serviceprops=262186&dfl_s0_servicestatustypes=16&dfl_s0_type=custom+variable&dfl_s0_type=host&dfl_s0_type=service&dfl_s0_val_pre=OPRSTATUS&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_value=InService&dfl_s0_value=PPC-MAIN&dfl_s0_value=ppc_time&dfl_s1_hostprops=262186&dfl_s1_hoststatustypes=2&dfl_s1_op=~&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op=!~&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op=!=&dfl_s1_op==&dfl_s1_op=!~&dfl_s1_serviceprops=262186&dfl_s1_servicestatustypes=16&dfl_s1_type=host&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=plugin+output&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=custom+variable&dfl_s1_type=service&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=OPRSTATUS&dfl_s1_val_pre=&dfl_s1_value=PPC-Log&dfl_s1_value=wsus_updates&dfl_s1_value=eventlog_Windows_System&dfl_s1_value=eventlog_Puppet_Agent&dfl_s1_value=unlock_puppet&dfl_s1_value=task_unlock_puppet_agent&dfl_s1_value=process_mem_3rd_MongoDB&dfl_s1_value=ntp_timesync&dfl_s1_value=Connectivity&dfl_s1_value=connectivity_RDP&dfl_s1_value=InService&dfl_s1_value=disk_c&scrollTo=0
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'backends' => ['51656','4120b','a103e','53dec','d4c78','bc29d','51d28','bde22','90ba2','5f10e','a690a','4120b','a103e','53dec','d4c78','bc29d','51d28','bde22','90ba2','5f10e','a690a'],'dfl_s0_hostprops' => '262186','dfl_s0_hoststatustypes' => '2'...
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] APPL-1: peer is down: connecting... (10.0.53.142:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=DE-Wildenberg-PPC-MAIN&scrollTo=0
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'host' => 'DE-Wildenberg-PPC-MAIN','scrollTo' => '0'};
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:52][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:53][dkrdswppthrukp01.vestasext.net][WARN] lmd not responding while trying to contact it (waitpid, WNOHANG). Try #2 result: $rc: 0, $rc_shifted: 0
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 48.9s to load.
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 62.6s to load.
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 34,'com_author' => 'John Carl Marzan','com_data' => '309979205','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:06:...
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] user:    jalmr
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.224.109
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'Leanne Mae Dayao','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:07:00','expir...
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] user:    laeda
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.226.231
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 55.6s to load.
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/cmd.cgi
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] params:  {'active_downtimes' => 'on','ahas' => '','broadcast_notification' => '','childoptions' => '0','cmd_mod' => 2,'cmd_typ' => 7,'com_author' => 'Leanne Mae Dayao','com_data' => '','com_id' => 0,'down_id' => 0,'end_time' => '2021-01-22 14:07:00','expir...
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] user:    laeda
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] address: 10.219.226.231
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=DE-SchneebergerhofWindhuebel-PPC-MAIN&scrollTo=0
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'host' => 'DE-SchneebergerhofWindhuebel-PPC-MAIN','scrollTo' => '0'};
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:54][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?nav=&hidesearch=0&hidetop=&dfl_columns=host_name,description,state,last_check,duration,plugin_output,cust_OPRSTATUS,host_address,cust_HOSTNETWORK,cust_CUSTCATEGORY&style=detail&dfl_s0_type=host&dfl_s0_type=service&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_op=~&dfl_s0_op==&dfl_s0_value=-WAN1&dfl_s0_value=ntp_timesync&sortoption=9&sorttype=2&reload_nav=1&scrollTo=400&_=1611313733465
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313733465','dfl_columns' => 'host_name,description,state,last_check,duration,plugin_output,cust_OPRSTATUS,host_address,cust_HOSTNETWORK,cust_CUSTCATEGORY','dfl_s0_op' => ['~','='],'dfl_s0_type' => ['host','service'],'dfl_s0_val_pre' =...
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] user:    chj
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.48.68
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=DE-Schluechtern-PPC-MAIN&scrollTo=0
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'host' => 'DE-Schluechtern-PPC-MAIN','scrollTo' => '0'};
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:55][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/parts.cgi?part=_host_downtimes&host=DE-TennenbronnVOG-VOC1-SRV&backend=d4c78
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'backend' => 'd4c78','host' => 'DE-TennenbronnVOG-VOC1-SRV','part' => '_host_downtimes'};
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] user:    jalmr
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.224.109
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=TH-Romklao-PPC-MAIN
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'host' => 'TH-Romklao-PPC-MAIN'};
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:56][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?nav=&hidesearch=0&hidetop=&dfl_columns=host_name,description,state,last_check,duration,plugin_output,cust_OPRSTATUS,host_address,cust_HOSTNETWORK,cust_CUSTCATEGORY&style=detail&update.x=11&update.y=6&dfl_s0_hoststatustype=2&dfl_s0_hoststatustype=4&dfl_s0_hoststatustype=8&dfl_s0_hoststatustype=1&dfl_s0_type=custom variable&dfl_s0_type=service&dfl_s0_type=custom variable&dfl_s0_type=custom variable&dfl_s0_val_pre=OPRSTATUS&dfl_s0_val_pre=&dfl_s0_val_pre=REGION&dfl_s0_val_pre=COUNTRY&dfl_s0_op==&dfl_s0_op==&dfl_s0_op==&dfl_s0_op==&dfl_s0_value=InService&dfl_s0_value=wsus_updates&dfl_s0_value=NCE&dfl_s0_value=RU&_=1611313736658
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313736658','dfl_columns' => 'host_name,description,state,last_check,duration,plugin_output,cust_OPRSTATUS,host_address,cust_HOSTNETWORK,cust_CUSTCATEGORY','dfl_s0_hoststatustype' => ['2','4','8','1'],'dfl_s0_op' => ['=','=','=','='],'...
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] user:    chj
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.48.68
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:57][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=DE-Rossau-PPC-MAIN
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'host' => 'DE-Rossau-PPC-MAIN'};
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:08:59][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/panorama.cgi?task=dashboard_save_states
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'current_tab' => 'tabpan-tab_96','nr' => 'tabpan-tab_96','states' => '{"tabpan-tab_96_panlet_24":{"state":3},"tabpan-tab_96_panlet_12":{"state":3},"tabpan-tab_96_panlet_14":{},"tabpan-tab_96_panlet_2":{"state":3},"tabpan-tab_96_panlet_23":{},"tab...
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] user:    laeda
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.226.231
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:09:00][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/panorama.cgi?task=status
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'backends' => ['a103e','5f10e','53dec','bde22','4120b','51656','bc29d','51d28','d4c78','a690a'],'current_tab' => 'tabpan-tab_269','reschedule' => '','state_type' => 'hard','task' => 'status','types' => '{"filter":{"[\\"on\\",null,\\"[{\\\\\\"host...
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] user:    chj
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.48.68
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/extinfo.cgi?type=2&host=DK-Oesterild-WAN1&service=check_cdp_neighbors&backend=a690a&scrollTo=1483&_=1611313743852
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313743852','backend' => 'a690a','host' => 'DK-Oesterild-WAN1','scrollTo' => '1483','service' => 'check_cdp_neighbors','type' => '2'};
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-espi
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.40.196
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:09:05][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/extinfo.cgi?type=2&host=DK-HED42-MK5-Testlab-VOB1-SRV1&service=mssql_database-file-dbcc-shrinks&backend=a690a&_=1611313746848
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313746848','backend' => 'a690a','host' => 'DK-HED42-MK5-Testlab-VOB1-SRV1','service' => 'mssql_database-file-dbcc-shrinks','type' => '2'};
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-sfpo
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.39.171
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:09:08][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?hidesearch=2&s0_op=~&s0_type=search&add_default_service_filter=1&s0_value=se:ppc_time&reload_nav=1&scrollTo=2478&_=1611313880544
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313880544','add_default_service_filter' => '1','hidesearch' => '2','reload_nav' => '1','s0_op' => '~','s0_type' => 'search','s0_value' => 'se:ppc_time','scrollTo' => '2478'};
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-drbi
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.224.69
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:11:23][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?backends=51656&backends=4120b&backends=53dec&backends=d4c78&backends=a103e&backends=bc29d&backends=51d28&backends=bde22&backends=5f10e&backends=a690a&backends=4120b&backends=53dec&backends=d4c78&backends=a103e&backends=bc29d&backends=51d28&backends=bde22&backends=5f10e&backends=a690a&dfl_s0_hostprops=262186&dfl_s0_hoststatustypes=2&dfl_s0_op==&dfl_s0_op=!~&dfl_s0_op=!~&dfl_s0_op=!~&dfl_s0_op=!=&dfl_s0_op=!~&dfl_s0_op=!~&dfl_s0_op=~&dfl_s0_op=!~&dfl_s0_op=>=&dfl_s0_op=!=&dfl_s0_serviceprops=262186&dfl_s0_servicestatustypes=16&dfl_s0_type=custom variable&dfl_s0_type=service&dfl_s0_type=host&dfl_s0_type=service&dfl_s0_type=service&dfl_s0_type=plugin output&dfl_s0_type=service&dfl_s0_type=host&dfl_s0_type=service&dfl_s0_type=duration&dfl_s0_type=service&dfl_s0_val_pre=OPRSTATUS&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_val_pre=&dfl_s0_value=InService&dfl_s0_value=wsus_updates|input_bad_status&dfl_s0_value=-wtg|-wan|-SW|PPC-MAIN&dfl_s0_value=patching_psp_upgrade|interface|envir_|eventlog_|Connectivity&dfl_s0_value=ambient_humidity|ambient_temperature&dfl_s0_value=Connection refused|Plugin timed out|snmpwalk returns no product name|problem connecting to|could not contact snmp agent|unlock_puppet&dfl_s0_value=service_&dfl_s0_value=vob|voc&dfl_s0_value=sql|ntp_|_mem_|task_SCADA_OPC_Agent_Task|task_unlock_puppet&dfl_s0_value=20m&dfl_s0_value=battery_remaining_time&dfl_s1_hostprops=262186&dfl_s1_hoststatustypes=2&dfl_s1_op==&dfl_s1_op=!=&dfl_s1_op=!~&dfl_s1_op=!~&dfl_s1_op=!=&dfl_s1_op=!~&dfl_s1_op=!=&dfl_s1_op=~&dfl_s1_serviceprops=262186&dfl_s1_servicestatustypes=16&dfl_s1_type=custom variable&dfl_s1_type=service&dfl_s1_type=host&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_type=plugin output&dfl_s1_type=service&dfl_s1_type=service&dfl_s1_val_pre=OPRSTATUS&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_val_pre=&dfl_s1_value=InService&dfl_s1_value=wsus_updates&dfl_s1_value=-wtg|-wan|-SW|PPC-MAIN&dfl_s1_value=patching_psp_upgrade|interface|envir_|eventlog_|Connectivity&dfl_s1_value=ambient_humidity|ambient_temperature&dfl_s1_value=Connection refused|Plugin timed out|snmpwalk returns no product name|problem connecting to|could not contact snmp agent&dfl_s1_value=service_3rd_Puppet_Agent&dfl_s1_value=disk_c|disk_d|disk_e|disk_f&dfl_s2_hostprops=262186&dfl_s2_hoststatustypes=2&dfl_s2_op==&dfl_s2_op=!=&dfl_s2_op=!~&dfl_s2_op=!~&dfl_s2_op=!=&dfl_s2_op=!~&dfl_s2_op=!=&dfl_s2_op=~&dfl_s2_op=>=&dfl_s2_serviceprops=262186&dfl_s2_servicestatustypes=16&dfl_s2_type=custom variable&dfl_s2_type=service&dfl_s2_type=host&dfl_s2_type=service&dfl_s2_type=service&dfl_s2_type=plugin output&dfl_s2_type=service&dfl_s2_type=service&dfl_s2_type=duration&dfl_s2_val_pre=OPRSTATUS&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_val_pre=&dfl_s2_value=InService&dfl_s2_value=wsus_updates&dfl_s2_value=-wtg|-wan|-SW|PPC-MAIN&dfl_s2_value=patching_psp_upgrade|interface|envir_|eventlog_|Connectivity&dfl_s2_value=ambient_humidity|ambient_temperature&dfl_s2_value=Connection refused|Plugin timed out|snmpwalk returns no product name|problem connecting to|could not contact snmp 
agent&dfl_s2_value=service_3rd_Puppet_Agent&dfl_s2_value=sql&dfl_s2_value=1d&dfl_s3_hostprops=262186&dfl_s3_hoststatustypes=2&dfl_s3_op==&dfl_s3_op==&dfl_s3_op=~&dfl_s3_op=!~&dfl_s3_serviceprops=262186&dfl_s3_servicestatustypes=16&dfl_s3_type=custom variable&dfl_s3_type=service&dfl_s3_type=host&dfl_s3_type=host&dfl_s3_val_pre=OPRSTATUS&dfl_s3_val_pre=&dfl_s3_val_pre=&dfl_s3_val_pre=&dfl_s3_value=InService&dfl_s3_value=ntp_timesync&dfl_s3_value=vob|voc|ppc&dfl_s3_value=SW&dfl_s4_hostprops=262186&dfl_s4_hoststatustypes=2&dfl_s4_op==&dfl_s4_op=~&dfl_s4_op==&dfl_s4_serviceprops=262186&dfl_s4_servicestatustypes=16&dfl_s4_type=custom variable&dfl_s4_type=host&dfl_s4_type=service&dfl_s4_val_pre=OPRSTATUS&dfl_s4_val_pre=&dfl_s4_val_pre=&dfl_s4_value=InService&dfl_s4_value=PPC-MAIN&dfl_s4_value=ppc_time&dfl_s5_hostprops=262186&dfl_s5_hoststatustypes=2&dfl_s5_op==&dfl_s5_op=~&dfl_s5_op==&dfl_s5_serviceprops=262186&dfl_s5_servicestatustypes=20&dfl_s5_type=custom variable&dfl_s5_type=host&dfl_s5_type=service&dfl_s5_val_pre=OPRSTATUS&dfl_s5_val_pre=&dfl_s5_val_pre=&dfl_s5_value=InService&dfl_s5_value=PPC-Log&dfl_s5_value=ntp_timesync&dfl_s6_hostprops=262186&dfl_s6_hoststatustypes=2&dfl_s6_op==&dfl_s6_op=~&dfl_s6_op==&dfl_s6_serviceprops=262186&dfl_s6_servicestatustypes=16&dfl_s6_type=custom variable&dfl_s6_type=host&dfl_s6_type=service&dfl_s6_val_pre=OPRSTATUS&dfl_s6_val_pre=&dfl_s6_val_pre=&dfl_s6_value=InService&dfl_s6_value=PPC-Log&dfl_s6_value=connectivity_RDP&scrollTo=800&_=1611313885541
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313885541','backends' => ['51656','4120b','53dec','d4c78','a103e','bc29d','51d28','bde22','5f10e','a690a','4120b','53dec','d4c78','a103e','bc29d','51d28','bde22','5f10e','a690a'],'dfl_s0_hostprops' => '262186','dfl_s0_hoststatustypes'...
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-drbi
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.224.69
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] internal lmd error - did not get a valid response for at least any site at /usr/share/thruk/lib/Thruk/Backend/Manager.pm line 1880.
[2021/01/22 12:11:32][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?format=search
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'format' => 'search','limit' => '2000','query' => ''};
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] user:    jujrm
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.48.61
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:11:48][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?host=RU-Gukovo-26-VOB1-SRV1&style=detail&dfl_columns=host_name,description,state,last_check,duration,current_attempt,plugin_output,cust_OPRSTATUS&_=1611313908164
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611313908164','dfl_columns' => 'host_name,description,state,last_check,duration,current_attempt,plugin_output,cust_OPRSTATUS','host' => 'RU-Gukovo-26-VOB1-SRV1','style' => 'detail'};
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] user:    snoje
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.10.52.252
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] MED-3: peer is down: connecting... (172.26.66.57:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] NCE-1: peer is down: connecting... (172.26.66.52:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] NCE-3: peer is down: connecting... (172.26.66.53:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] MED-1: peer is down: connecting... (172.26.66.54:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] ASP-1: peer is down: connecting... (172.26.66.55:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] CHI-1: peer is down: connecting... (172.26.66.63:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] AME-1: peer is down: connecting... (172.26.66.56:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] Master: peer is down: connecting... (172.26.66.30:6558)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] NCE-2: peer is down: connecting... (172.26.66.58:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] MED-2: peer is down: connecting... (172.26.66.60:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:11:49][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:13:02][dkrdswppthrukp01.vestasext.net][INFO] [laeda][Master] cmd: COMMAND [1611313982] PROCESS_SERVICE_CHECK_RESULT;RU-Kazachiya-PPC-MAIN;Connectivity;0;newly commissioned|
[2021/01/22 12:13:02][dkrdswppthrukp01.vestasext.net][INFO] [laeda][Master] cmd: COMMAND [1611313982] PROCESS_SERVICE_CHECK_RESULT;RU-Tselinskaya-PPC-LOG-SRV;service_3rd_MongoDB;0;newly commissioned|
[2021/01/22 12:13:02][dkrdswppthrukp01.vestasext.net][INFO] [laeda][Master] cmd: COMMAND [1611313982] PROCESS_SERVICE_CHECK_RESULT;RU-Tselinskaya-PPC-MAIN;Connectivity;0;newly commissioned|
[2021/01/22 12:13:23][dkrdswppthrukp01.vestasext.net][INFO] [laeda][ASP-1] cmd: COMMAND [1611314003] PROCESS_SERVICE_CHECK_RESULT;TH-Romklao-PPC-MAIN;ppc_time;0;PPC Main Down|
[2021/01/22 12:13:32][dkrdswppthrukp01.vestasext.net][INFO] [laeda][ASP-1] cmd: COMMAND [1611314012] SCHEDULE_FORCED_SVC_CHECK;TH-Romklao-PPC-MAIN;ppc_time;1611314012
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 15.2s to load.
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/panorama.cgi?task=status
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] params:  {'current_tab' => 'tabpan-tab_269','dfl_s0_hostprops' => 0,'dfl_s0_hoststatustypes' => 15,'dfl_s0_op' => ['=','=','='],'dfl_s0_serviceprops' => 0,'dfl_s0_servicestatustypes' => 31,'dfl_s0_type' => ['custom variable','service','custom variable'],'d...
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] user:    chj
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] address: 10.10.48.68
[2021/01/22 12:13:50][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:14:03][dkrdswppthrukp01.vestasext.net][INFO] [laeda][ASP-1] cmd: COMMAND [1611314043] SCHEDULE_FORCED_SVC_CHECK;TH-Romklao-PPC-MAIN;ppc_time;1611314043
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?hidesearch=2&s0_op=~&s0_type=search&add_default_service_filter=1&s0_value=se:ppc_time&reload_nav=1&_=1611314041543
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611314041543','add_default_service_filter' => '1','hidesearch' => '2','reload_nav' => '1','s0_op' => '~','s0_type' => 'search','s0_value' => 'se:ppc_time'};
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-drbi
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.224.69
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:14:04][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:14:10][dkrdswppthrukp01.vestasext.net][INFO] [laeda][ASP-1] cmd: COMMAND [1611314050] SCHEDULE_FORCED_SVC_CHECK;TH-Romklao-PPC-MAIN;ppc_time;1611314050
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 16.9s to load.
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?hostgroup=all&style=summary&_=1611314044287
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] params:  {'_' => '1611314044287','hostgroup' => 'all','style' => 'summary'};
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] user:    jfgda
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] address: 10.10.61.22
[2021/01/22 12:14:23][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 17.0s to load.
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/panorama.cgi?task=status
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] params:  {'current_tab' => 'tabpan-tab_269','dfl_s0_hostprops' => 0,'dfl_s0_hoststatustypes' => 15,'dfl_s0_op' => ['=','=','='],'dfl_s0_serviceprops' => 0,'dfl_s0_servicestatustypes' => 31,'dfl_s0_type' => ['custom variable','service','custom variable'],'d...
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] user:    chj
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] address: 10.10.48.68
[2021/01/22 12:15:22][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:15:45][dkrdswppthrukp01.vestasext.net][INFO] [dmvlk][Master] cmd: COMMAND [1611314145] SCHEDULE_FORCED_SVC_CHECK;NL-GeefsweerLotC-VOC1-SRV;eventlog_Puppet_Agent;1611314145
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/status.cgi?hidesearch=2&s0_op=~&s0_type=search&add_default_service_filter=1&s0_value=se:ppc_time&reload_nav=1&_=1611314204548
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] params:  {'_' => '1611314204548','add_default_service_filter' => '1','hidesearch' => '2','reload_nav' => '1','s0_op' => '~','s0_type' => 'search','s0_value' => 'se:ppc_time'};
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] user:    nnit-drbi
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] address: 10.219.224.69
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] No Backend available
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] NCE-4: peer is down: connecting... (172.26.66.59:6557)
[2021/01/22 12:16:47][dkrdswppthrukp01.vestasext.net][ERROR] ***************************
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] ***************************
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] slow_page_log_threshold (15s) hit, page took 16.7s to load.
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] page:    http://wpp-monitoring.vestasext.net/thruk/cgi-bin/panorama.cgi?task=status
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] params:  {'current_tab' => 'tabpan-tab_269','dfl_s0_hostprops' => 0,'dfl_s0_hoststatustypes' => 15,'dfl_s0_op' => ['=','=','='],'dfl_s0_serviceprops' => 0,'dfl_s0_servicestatustypes' => 31,'dfl_s0_type' => ['custom variable','service','custom variable'],'d...
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] user:    chj
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] address: 10.10.48.68
[2021/01/22 12:16:51][dkrdswppthrukp01.vestasext.net][WARN] Profile:
[2021/01/22 12:17:16][dkrdswppthrukp01.vestasext.net][WARN] ***************************

Incremental corruption

Hello,

We use LMD to query 1 Nagios and 6 Icinga 2 systems.
It seems the incremental status update corrupts the internal cache: some services end up with the status and all other fields (plugin_output, state, etc.) of another service.
For example:
[host_name,display_name,plugin_output]
["xxx-yyy-bo02","HYCU Last Backup","Updates: 3 critical, 2 optional"]

If we wait a while (until a full update) or restart lmd, we get the correct information again:
["xxx-yyy-bo02","HYCU Last Backup","Status=OK, Compliancy=GREEN, Date=2021-02-25T07:04:05.355000"]

It happens more often with systems that are slow to respond (we have around 250 ms of latency between the LMD server and Icinga). We have never seen it on a Nagios system. The system where the problem shows up most often is also set up for high availability (2 Icinga connections declared in LMD).

I tried to look at the code, but it is quite difficult for me to follow.
If you need more information, please tell me.

Thanks a lot.

i/o timeout on shinken backend

This commit 96be7c9 breaks Shinken livestatus compatibility. Removing the test around the switch solves the issue.

That might be because Shinken doesn't close the socket at the end of the query. I don't know which behaviour is correct, so I'm asking for guidance before proposing a fix.

fatal error: concurrent map read and map write

Hi there,

Let me begin by thanking you for this tool, pretty awesome!

We have some Shinken deployments and we use LMD as a cache to avoid overloading livestatus. In this particular case the setup is Shinken 2.4.3 with over 5k hosts and 20k services; all components, including LMD, are installed on the same server.

On a few occasions, lmd goes offline and exits with the following error:

fatal error: concurrent map read and map write

goroutine 456549 [running]:
runtime.throw(0x8a03f5, 0x21)
	/root/.gvm/gos/go1.7.4/src/runtime/panic.go:566 +0x95 fp=0xc42c441bf0 sp=0xc42c441bd0
runtime.mapaccess1_faststr(0x81b260, 0xc420167dd0, 0xc42eb92224, 0x8, 0x1)
	/root/.gvm/gos/go1.7.4/src/runtime/hashmap_fast.go:201 +0x4f3 fp=0xc42c441c50 sp=0xc42c441bf0
main.(*Peer).BuildLocalResponseData(0xc4201be2d0, 0xc42649d4a0, 0xc4269add00, 0x0, 0x0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:1707 +0x14e fp=0xc42c441d58 sp=0xc42c441c50
main.(*Response).BuildLocalResponse.func1(0xc4201be2d0, 0xc42649d4a0, 0xc4269add00, 0xc42eb922a8, 0xc4201be2d0, 0xc42eb922d0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/response.go:516 +0x152 fp=0xc42c441f80 sp=0xc42c441d58
runtime.goexit()
	/root/.gvm/gos/go1.7.4/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc42c441f88 sp=0xc42c441f80
created by main.(*Response).BuildLocalResponse
	/var/omnibus/src/src/github.com/sni/lmd/lmd/response.go:540 +0x27a

goroutine 1 [select, 1532 minutes]:
main.mainLoop(0xc4201bc360, 0x8d4500)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:217 +0x9cb
main.main()
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:149 +0x5b

goroutine 17 [syscall, 1532 minutes, locked to thread]:
runtime.goexit()
	/root/.gvm/gos/go1.7.4/src/runtime/asm_amd64.s:2086 +0x1

goroutine 33 [select]:
main.LocalListenerLivestatus(0xc4201d21c0, 0x88d080, 0x3, 0xc420244090, 0xd, 0xc420244f00, 0xc4201bc780)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:227 +0x73d
main.LocalListener(0xc4201d21c0, 0xc420244090, 0xd, 0xc420244f00, 0xc420244f10, 0xc4201bc780)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:173 +0x2e1
main.mainLoop.func1(0xc4201d21c0, 0xc420244f00, 0xc420244f10, 0xc4201bc780, 0xc420244090, 0xd)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:206 +0x7f
created by main.mainLoop
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:207 +0x708

goroutine 6 [syscall, 1532 minutes]:
os/signal.signal_recv(0x0)
	/root/.gvm/gos/go1.7.4/src/runtime/sigqueue.go:116 +0x157
os/signal.loop()
	/root/.gvm/gos/go1.7.4/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.1
	/root/.gvm/gos/go1.7.4/src/os/signal/signal_unix.go:28 +0x41

goroutine 18 [select, 1532 minutes, locked to thread]:
runtime.gopark(0x8d4d40, 0x0, 0x88f45e, 0x6, 0x18, 0x2)
	/root/.gvm/gos/go1.7.4/src/runtime/proc.go:259 +0x13a
runtime.selectgoImpl(0xc42004af30, 0x0, 0x18)
	/root/.gvm/gos/go1.7.4/src/runtime/select.go:423 +0x11d9
runtime.selectgo(0xc42004af30)
	/root/.gvm/gos/go1.7.4/src/runtime/select.go:238 +0x1c
runtime.ensureSigM.func1()
	/root/.gvm/gos/go1.7.4/src/runtime/signal1_unix.go:304 +0x2f3
runtime.goexit()
	/root/.gvm/gos/go1.7.4/src/runtime/asm_amd64.s:2086 +0x1

goroutine 34 [IO wait, 1532 minutes]:
net.runtime_pollWait(0x7f565aff3e40, 0x72, 0x0)
	/root/.gvm/gos/go1.7.4/src/runtime/netpoll.go:160 +0x59
net.(*pollDesc).wait(0xc42017baa0, 0x72, 0xc42005dc38, 0xc420010038)
	/root/.gvm/gos/go1.7.4/src/net/fd_poll_runtime.go:73 +0x38
net.(*pollDesc).waitRead(0xc42017baa0, 0xa75440, 0xc420010038)
	/root/.gvm/gos/go1.7.4/src/net/fd_poll_runtime.go:78 +0x34
net.(*netFD).accept(0xc42017ba40, 0x0, 0xa73e00, 0xc4202820c0)
	/root/.gvm/gos/go1.7.4/src/net/fd_unix.go:419 +0x238
net.(*UnixListener).accept(0xc420282060, 0x4820de, 0xc42005dcf8, 0x40c060)
	/root/.gvm/gos/go1.7.4/src/net/unixsock_posix.go:158 +0x32
net.(*UnixListener).Accept(0xc420282060, 0x8d44c8, 0xc4201bc780, 0x88d8e2, 0x4)
	/root/.gvm/gos/go1.7.4/src/net/unixsock.go:229 +0x49
main.LocalListenerLivestatus(0xc4201d21c0, 0x88d8e2, 0x4, 0xc420234320, 0x1a, 0xc420244f00, 0xc4201bc780)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:210 +0x3d5
main.LocalListener(0xc4201d21c0, 0xc420234320, 0x1a, 0xc420244f00, 0xc420244f10, 0xc4201bc780)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:180 +0x3a6
main.mainLoop.func1(0xc4201d21c0, 0xc420244f00, 0xc420244f10, 0xc4201bc780, 0xc420234320, 0x1a)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:206 +0x7f
created by main.mainLoop
	/var/omnibus/src/src/github.com/sni/lmd/lmd/main.go:207 +0x708

goroutine 50 [chan receive, 1532 minutes]:
main.LocalListenerLivestatus.func1(0xc4201bc780, 0x88d080, 0x3, 0xc420244090, 0xd, 0xa78080, 0xc42015c010)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:201 +0x4d
created by main.LocalListenerLivestatus
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:205 +0x3b0

goroutine 9 [chan receive, 1532 minutes]:
main.LocalListenerLivestatus.func1(0xc4201bc780, 0x88d8e2, 0x4, 0xc420234320, 0x1a, 0xa78100, 0xc420282060)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:201 +0x4d
created by main.LocalListenerLivestatus
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:205 +0x3b0

goroutine 35 [runnable]:
runtime.Caller(0x2, 0x4ae0b9, 0xc42a673540, 0xc42a6735a9, 0x19, 0x0)
	/root/.gvm/gos/go1.7.4/src/runtime/extern.go:178 +0x82
main.(*LoggingLock).RLockN(0xc420167e60, 0x2)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/logginglock.go:47 +0x40
main.(*Peer).StatusGet(0xc4201be2d0, 0x890d23, 0x8, 0x67, 0xffffffffffffffff)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:363 +0x3c
main.(*Peer).GetConnection(0xc4201be2d0, 0xc42515ea50, 0xc42515ea50, 0x6019b9, 0xc4201661b0, 0xc420232820, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:996 +0xa5
main.(*Peer).query(0xc4201be2d0, 0xc42515f530, 0x0, 0x0, 0x0, 0x0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:800 +0x7f
main.(*Peer).Query(0xc4201be2d0, 0xc42515f530, 0xc42515edf8, 0xc4201f4b60, 0x2, 0x2, 0x2)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:937 +0x39
main.(*Peer).UpdateDeltaCommentsOrDowntimes(0xc4201be2d0, 0x891c00, 0x9, 0x0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:710 +0x1b8
main.(*Peer).UpdateDeltaTables(0xc4201be2d0, 0x89241d)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:500 +0xb59
main.(*Peer).periodicUpdate(0xc4201be2d0, 0xc42515fdab, 0xc42515fdb0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:328 +0x381
main.(*Peer).updateLoop(0xc4201be2d0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:273 +0x20f
main.(*Peer).Start.func1(0xc4201be2d0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:188 +0x4f
created by main.(*Peer).Start
	/var/omnibus/src/src/github.com/sni/lmd/lmd/peer.go:193 +0x1b9

goroutine 456548 [semacquire]:
sync.runtime_Semacquire(0xc42eb922dc)
	/root/.gvm/gos/go1.7.4/src/runtime/sema.go:47 +0x30
sync.(*WaitGroup).Wait(0xc42eb922d0)
	/root/.gvm/gos/go1.7.4/src/sync/waitgroup.go:131 +0x97
main.(*Response).BuildLocalResponse(0xc42649d4a0, 0xc42eb922c0, 0x1, 0x1, 0xc4269add00, 0xc42eb922c0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/response.go:543 +0x4a1
main.NewResponse(0xc422bc5540, 0x892dbe, 0xb, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/response.go:98 +0x7ca
main.(*Request).GetResponse(0xc422bc5540, 0x2ec3ac50, 0xa9aae0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/request.go:304 +0x176
main.ProcessRequests(0xc42391d648, 0x1, 0x1, 0xa7ae80, 0xc42391d640, 0xc4269ad860, 0x14, 0x0, 0x0, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:92 +0x163
main.QueryServer(0xa7ae80, 0xc42391d640, 0xc42f5c2640, 0x0)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:45 +0x58f
main.LocalListenerLivestatus.func2(0xc42649d380, 0xa7ae80, 0xc42391d640)
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:225 +0x55
created by main.LocalListenerLivestatus
	/var/omnibus/src/src/github.com/sni/lmd/lmd/listener.go:226 +0x47b

Any clue about the error?

Thanks in advance.

Correct logrotation of logfile

What is the suggested way of rotating the log for lmd without restarting it?
Would copytruncate work?
Is there support for receiving a signal to reopen the file?
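
For context, this is the kind of copytruncate-based logrotate snippet we would try; the log path is an assumption and has to match the configured LogFile. copytruncate avoids having to signal lmd, at the cost of possibly losing a few lines written during the copy:

/var/log/lmd/lmd.log {
    daily
    rotate 7
    missingok
    notifempty
    compress
    delaycompress
    copytruncate
}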

TIA

[Warn][response.go:361] client error: bad request: backend 7215e does not exist

I get the error log below in the lmd.log file while browsing the Thruk GUI. None of the backends are displayed. Please help me with this.

[2017-02-18 16:15:54][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:15:55][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:15:56][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:11][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:12][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:14][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:15][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:16][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:17][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:20][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:21][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:29][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:30][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:16:35][Info][peer.go:277] [Monitoring Site B] switched to idle interval, last query: never
[2017-02-18 16:16:35][Info][peer.go:277] [Monitoring Site A] switched to idle interval, last query: never
[2017-02-18 16:17:14][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:17:15][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:18:15][Warn][response.go:361] client error: bad request: backend 7215e does not exist
[2017-02-18 16:18:16][Warn][response.go:361] client error: bad request: backend 7215e does not exist
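
For what it's worth, lmd adds peer_key and peer_name columns to its tables, so a quick query against the lmd socket should show which backend hashes it actually knows about and whether 7215e is among them (just a sketch, limited to a few rows):

GET hosts
Columns: peer_key peer_name name
Limit: 10
OutputFormat: json
ResponseHeader: fixed16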

id above 999999

When comment or downtime ids are above 999999, lmd issues a broken query to the backend, with the id written in scientific notation, in https://github.com/sni/lmd/blob/master/lmd/peer.go#L908

lmd then receives an error code, flushes its cache and reloads everything.

That must be because lmd treats integers as float64.

I'm using Shinken as a backend, and downtime ids are computed from timestamps in milliseconds.

Here is an example of a wrong query:

GET downtimes
ResponseHeader: fixed16
OutputFormat: json
Columns: author comment duration end_time entry_time fixed id is_service start_time triggered_by type host_name service_description host_contacts service_contacts
Filter: id = 1.531507750159428e+15
Or: 1
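
A minimal Go sketch of the underlying formatting behaviour (not lmd's actual code, just demonstrating how a large float64 id ends up in scientific notation and one way to render it without an exponent):

package main

import (
	"fmt"
	"strconv"
)

func main() {
	// JSON numbers are decoded into float64 by default, so a millisecond-based
	// downtime id such as 1531507750159428 arrives as a float64.
	id := 1531507750159428.0

	// Default formatting (%v) falls back to %g and switches to scientific
	// notation for large values -- this is what ends up in the Filter line.
	fmt.Printf("Filter: id = %v\n", id) // Filter: id = 1.531507750159428e+15

	// Formatting with strconv's 'f' format keeps the full integer representation.
	fmt.Println("Filter: id = " + strconv.FormatFloat(id, 'f', -1, 64))
	// Filter: id = 1531507750159428
}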

Weird chars in logfile and "no backend available"

LMD version 1.6.1
Connecting locally to naemon-livestatus 1.0.6
Used by Thruk 2.18 with sock file
The same instance is used by a cluster of Thruk 2.18 hosts via network socket

I often see the following in the logfile:

[2019-09-09 09:35:57.787][Info][listener.go:147] incoming comments request from 10.70.96.25:34724 to 10.71.6.20:6560 finished in 50.876<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.789][Info][listener.go:147] incoming downtimes request from 10.70.96.27:34726 to 10.71.6.20:6560 finished in 59.395<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.789][Info][listener.go:147] incoming downtimes request from 10.70.96.24:34728 to 10.71.6.20:6560 finished in 46.68<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.792][Info][listener.go:147] incoming comments request from 10.70.96.26:34730 to 10.71.6.20:6560 finished in 60.293<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.794][Info][listener.go:147] incoming services request from 10.70.96.24:34732 to 10.71.6.20:6560 finished in 47.873<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.795][Info][listener.go:147] incoming downtimes request from 10.70.96.26:34734 to 10.71.6.20:6560 finished in 56.716<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.801][Info][listener.go:147] incoming services request from 10.70.96.24:34736 to 10.71.6.20:6560 finished in 50.324<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.812][Info][listener.go:147] incoming comments request from 10.70.96.24:43568 to 10.71.6.20:6560 finished in 50.678<C2><B5>s, response size: 3 B
[2019-09-09 09:35:57.815][Info][listener.go:147] incoming downtimes request from 10.70.96.25:43570 to 10.71.6.20:6560 finished in 46.768<C2><B5>s, response size: 3 B

I wonder why those characters are there when normally it's "ms" for milliseconds... Also the response size of 3 B seems strange...
In the Thruk log file I also often see

[2019/09/09 10:23:04][yanez][ERROR][Thruk] Cached data: ERROR: socket error. (/var/cache/naemon/lmd.sock)
[2019/09/09 10:23:04][yanez][ERROR][Thruk] Error in: /thruk/cgi-bin/status.cgi
[2019/09/09 10:23:04][yanez][ERROR][Thruk] socket error at /usr/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 380.
 at /usr/lib64/thruk/perl5/Plack/Util.pm line 142.
        eval {...} called at /usr/lib64/thruk/perl5/Plack/Util.pm line 142
        Plack::Util::run_app('CODE(0x1d129c0)', 'HASH(0x286c5c0)') called at /usr/lib64/thruk/perl5/Plack/Handler/FCGI.pm line 143
        Plack::Handler::FCGI::run('Plack::Handler::FCGI=HASH(0x1214c48)', 'CODE(0x1d129c0)') called at /usr/share/thruk/script/thruk_fastcgi.pl line 24
[2019/09/09 10:23:04][yanez][ERROR][Thruk] No Backend available

All this while the process is running correctly.
Any idea what can cause this (the weird chars and the "no backend available" errors may not be related at all)?

Panic: got no index when updating delta comments

We're seeing this panic being thrown in lmd.log when requests for the logs table are received by lmd. These requests are initiated by a batch job that requests multiple logs. The request that is being handled when this panic happens gets interrupted; further requests are never made because of a connection error.

Update: as explained in a comment below, it turns out these panics have nothing to do with requests to a particular table such as logs.

We're still investigating this issue, looking for additional clues that don't involve restarting critical services such as Shinken, since this is a big deployment where reloading the Shinken configuration would take quite some time. However, I'm opening this issue in case this is actually a bug in lmd itself.

[2018-11-07 11:31:13][Panic][peer.go:1108] got null index
[2018-11-07 11:31:13][Error][peer.go:2696] [livestatus-socket by socket] Panic: got null index
[2018-11-07 11:31:13][Error][peer.go:2697] [livestatus-socket by socket] goroutine 42 [running]:
runtime/debug.Stack(0xc4201b6600, 0x92fc34, 0xe)
	/home/danirod/.gvm/gos/go1.9.2/src/runtime/debug/stack.go:24 +0xa7
main.logPanicExitPeer(0xc420120180)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:2697 +0x12f
panic(0x87e9a0, 0xc42358eb80)
	/root/.gvm/gos/go1.9.2/src/runtime/panic.go:491 +0x283
github.com/kdar/factorlog.(*FactorLog).Panicf(0xc4201b6600, 0x92fde6, 0xe, 0x0, 0x0, 0x0)
	/home/danirod/go/src/github.com/kdar/factorlog/factorlog.go:411 +0x145
main.(*Peer).UpdateDeltaCommentsOrDowntimes(0xc420120180, 0x92c49d, 0x8, 0x0, 0x0)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:1108 +0x177b
main.(*Peer).UpdateDeltaTables(0xc420120180, 0x92d76a)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:779 +0xb1d
main.(*Peer).periodicUpdate(0xc420120180, 0xc420073e7e, 0xc420073e80)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:398 +0x370
main.(*Peer).updateLoop(0xc420120180)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:329 +0x302
main.(*Peer).Start.func1(0xc420120180)
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:228 +0x59
created by main.(*Peer).Start
	/home/danirod/go/src/github.com/sni/lmd/lmd/peer.go:225 +0x17f

[2018-11-07 11:31:13][Error][peer.go:2699] [livestatus-socket by socket] LastQuery:
[2018-11-07 11:31:13][Error][peer.go:2700] [livestatus-socket by socket] GET comments
ResponseHeader: fixed16
OutputFormat: json
Columns: author comment entry_time entry_type expires expire_time id is_service persistent source type host_name service_description host_contacts service_contacts
Filter: id = 429
[and then a list of about a thousand filters on the id column that I've omitted for brevity]

LMD 1.9.4 - disorder in `custom_variable_names` and `custom_variable_values`

Here is an example:

# telnet localhost 50000 
Trying ::1...
Connected to wocu-monitoring-aio.
Escape character is '^]'.
GET hosts
Columns: custom_variable_names custom_variable_values
Filter: host_name = mysnmphostname
[[["SNMPCOMMUNITY","IFACES_BYNAME","LAT","MEMFREE_CRITICAL_THRESHOLD","LONG","SNMPVERSION","TRAFFIC_CRITICAL_THRESHOLD","CPU_WARNING_THRESHOLD","CPU_CRITICAL_THRESHOLD","MEMFREE_WARNING_THRESHOLD","TRAFFIC_WARNING_THRESHOLD","TRAFFIC_SUM_IFACES_REGEXP","DEVICEVENDOR"],["public","ethernet0/1.2815$(ethernet0/1.2815)$$(1000000000)$$(500000000)$$(b)$","10","2c","90","75","90","25","75","^ethernet0/1.2815$$","Teldat"]]

names and values don't match each other:

names = ["SNMPCOMMUNITY","IFACES_BYNAME","LAT","MEMFREE_CRITICAL_THRESHOLD","LONG","SNMPVERSION","TRAFFIC_CRITICAL_THRESHOLD","CPU_WARNING_THRESHOLD","CPU_CRITICAL_THRESHOLD","MEMFREE_WARNING_THRESHOLD","TRAFFIC_WARNING_THRESHOLD","TRAFFIC_SUM_IFACES_REGEXP","DEVICEVENDOR"]

values = ["public","ethernet0/1.2815$(ethernet0/1.2815)$$(1000000000)$$(500000000)$$(b)$","10","2c","90","75","90","25","75","^ethernet0/1.2815$$","Teldat"]

print(names[5])
SNMPVERSION

print(values[5])
75  # this is a threshold value, it should be the snmp version

print(values[3])
2c  # here it is!

>>> len(names)
13

>>> len(values)
11

Restarting lmd (which triggers a full scan) fixes the problem, but within a few minutes the lists are messed up again, so the bug is probably in the delta update.

Version 1.9.1 is working fine.

socket: too many open files / live.sock: bind: address already in use

Hi Sven,

I'm running an LMD cluster and usually get the error message below if I stop and start one of the LMD processes in the cluster. LMD will recover but then produces the same error again. My workaround is to kill the connections and start LMD again. But is there a way for LMD to handle this itself? Thanks in advance.

Specs:
LMD 1.3.0 cluster (2 LMD server and a thruk Server)
Thruk Version 2.16~2
Nagios 4.3.1 and 4.1.1
MK-livestatus 1.2.6p15

Error Log:

No Backend available
None of the configured Backends could be reached, please have a look at the logfile for detailed information and make sure the core is up and running.

Details:
/var/cache/thruk/lmd/live.sock: The request contains an invalid header. - Post http://172.26.66.70:8080/query: dial tcp 172.26.66.70:8080: socket: too many open files
at /usr/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 380

net/http.(*connReader).Read(0xc422332600, 0xc422342000, 0x1000, 0x1000, 0xc422193d38, 0x5b19fc, 0xc42211de80)
/usr/lib/golang/src/net/http/server.go:753 +0x105
bufio.(*Reader).fill(0xc42217cde0)
/usr/lib/golang/src/bufio/bufio.go:97 +0x11a
bufio.(*Reader).Peek(0xc42217cde0, 0x4, 0x1895617d4, 0xba39e0, 0x0, 0x0, 0xba39e0)
/usr/lib/golang/src/bufio/bufio.go:129 +0x3a
net/http.(*conn).serve(0xc422095040, 0xb685e0, 0xc42217a780)
/usr/lib/golang/src/net/http/server.go:1826 +0x88f
created by net/http.(*Server).Serve
/usr/lib/golang/src/net/http/server.go:2720 +0x288

[2018-06-14 10:41:05][Info][peer.go:677] [nagt99] updating objects failed after: 214.98ยตs: dial tcp 172.26.66.208:6557: socket: too many open files
[2018-06-14 10:41:05][Info][main.go:465] got sigint, quitting
[2018-06-14 10:41:05][Info][listener.go:327] stopping listener on :8080
[2018-06-14 10:41:05][Info][listener.go:253] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2018-06-14 10:41:05][Info][listener.go:266] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2018-06-14 10:41:05][Info][listener.go:253] stopping unix listener on /var/cache/thruk/lmd/live.sock
[2018-06-14 10:41:05][Info][listener.go:266] unix listener /var/cache/thruk/lmd/live.sock shutdown complete
[2018-06-14 10:41:05][Warn][response.go:457] sending error response: 400 - Post http://172.26.66.201:8080/query: read tcp 172.26.66.70:51118->172.26.66.201:8080: read: connection reset by peer
[2018-06-14 10:41:12][Info][listener.go:248] listening for incoming queries on unix /var/cache/thruk/lmd/live.sock
[2018-06-14 10:41:12][Fatal][listener.go:240] listen error: listen unix /var/cache/thruk/lmd/live.sock: bind: address already in use

[2018-06-14 10:43:51][Warn][response.go:457] sending error response: 400 - Post http://172.26.66.70:8080/query: dial tcp 172.26.66.70:8080: socket: too many open files
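
As a side note on the "bind: address already in use" part: that error typically comes from a stale socket file left over from the previous process. A generic Go pattern for cleaning it up before listening (just a sketch, not necessarily what lmd does) would be:

package main

import (
	"fmt"
	"log"
	"net"
	"os"
)

// removeStaleSocket deletes the unix socket file if nothing is listening
// on it anymore, so a subsequent net.Listen("unix", path) does not fail
// with "address already in use".
func removeStaleSocket(path string) error {
	if _, err := os.Stat(path); os.IsNotExist(err) {
		return nil // nothing to clean up
	}
	if conn, err := net.Dial("unix", path); err == nil {
		conn.Close()
		return fmt.Errorf("socket %s is still in use", path)
	}
	return os.Remove(path)
}

func main() {
	path := "/var/cache/thruk/lmd/live.sock"
	if err := removeStaleSocket(path); err != nil {
		log.Fatal(err)
	}
	l, err := net.Listen("unix", path)
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()
}

The "too many open files" messages look like a separate problem (the process hitting its file descriptor limit while connections pile up).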

Panic: runtime error: index out of range

Hi Sven,

Several backends get disconnected from time to time, and there's a delay when submitting passive checks or adding comments. Any thoughts on this?

Environment
OS: RHel6
Thruk: 2.32-3
LMD: lmd - version 1.8.2 (Build: 1a9f3e3)
Backend: Nagios and Icinga

lmd.ini
StaleBackendTimeout = 3600
NetTimeout = 240
ConnectTimeout = 30
BackendKeepAlive = false
FullUpdateInterval = 600

lmd.log
[2020-11-26 13:10:04.146][Error][peer.go:2718] [Master] Panic: runtime error: index out of range [1] with length 1
[2020-11-26 13:10:04.146][Error][peer.go:2719] [Master] Version: 1.8.2 (Build: 1a9f3e3)
[2020-11-26 13:10:04.146][Error][peer.go:2720] [Master] goroutine 210285 [running]:
runtime/debug.Stack(0x95aac0, 0xc0132543a0, 0x2)
/usr/local/go/src/runtime/debug/stack.go:24 +0x9d
main.logPanicExitPeer(0xc0001c4e00)
/root/lmd/lmd/peer.go:2720 +0x39c
panic(0x9e07e0, 0xc1015ab040)
/usr/local/go/src/runtime/panic.go:969 +0x166
main.VirtColCustomVariables(0xc00d5613b0, 0xc000261db0, 0x985c00, 0xc03a7916e0)
/root/lmd/lmd/datarow.go:570 +0x24a
main.(*DataRow).getVirtRowValue(0xc00d5613b0, 0xc000261db0, 0xc03a7916e0, 0xc00d56af78)
/root/lmd/lmd/datarow.go:407 +0x21b
main.(*DataRow).GetHashMap(0xc00d5613b0, 0xc000261db0, 0xc01c014b00)
/root/lmd/lmd/datarow.go:317 +0x4b
main.(*Filter).Match(0xc17d6ffc00, 0xc00d5613b0, 0xc046818100)
/root/lmd/lmd/filter.go:485 +0x31e
main.(*DataRow).MatchFilter(0xc00d5613b0, 0xc17d6ffc00, 0x0)
/root/lmd/lmd/datarow.go:657 +0x189
main.(*DataRow).MatchFilter(0xc00d5613b0, 0xc17d6ffc80, 0xc054893d01)
/root/lmd/lmd/datarow.go:618 +0xa5
main.(*DataRow).MatchFilter(0xc00d5613b0, 0xc17d6ffd00, 0x1)
/root/lmd/lmd/datarow.go:618 +0xa5
main.(*Peer).gatherResultRows(0xc0001c4e00, 0xc127fa2a80, 0xc078d1e000, 0xc127fa2ae0)
/root/lmd/lmd/peer.go:2576 +0x105
main.(*Peer).BuildLocalResponseData(0xc0001c4e00, 0xc127fa2a80, 0xc078d1e000, 0xc127fa2ae0)
/root/lmd/lmd/peer.go:2476 +0x251
main.(*Response).BuildLocalResponse.func2(0xc127fa2a80, 0xc078d1e000, 0xc127fa2ae0, 0xc0001c4e00, 0xc0b94a84d0)
/root/lmd/lmd/response.go:505 +0x19e
created by main.(*Response).BuildLocalResponse
/root/lmd/lmd/response.go:498 +0x20b
[2020-11-26 13:10:04.146][Error][peer.go:2722] [Master] LastQuery:
[2020-11-26 13:10:04.147][Error][peer.go:2723] [Master] GET services
ResponseHeader: fixed16
OutputFormat: json
Columns: host_name description accept_passive_checks acknowledged acknowledgement_type active_checks_enabled check_freshness check_options check_type checks_enabled current_attempt current_notification_number custom_variable_values event_handler_enabled execution_time first_notification_delay flap_detection_enabled has_been_checked in_check_period in_notification_period is_executing is_flapping last_check last_hard_state last_hard_state_change last_notification last_state last_state_change last_time_critical last_time_warning last_time_ok last_time_unknown latency long_plugin_output low_flap_threshold modified_attributes modified_attributes_list next_check next_notification notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data scheduled_downtime_depth state state_type staleness pnpgraph_present check_source

LMD clustering failing

LMD clustering is failing for us with errors:
lmd.log:

[2018-09-19 18:42:33][Warn][response.go:457] sending error response: 500 - Post http://10.3.79.219:6556/query: EOF
[2018-09-19 18:42:33][Warn][response.go:432] write error: write unix /omd/sites/core/tmp/thruk/lmd/live.sock->@: i/o timeout
[2018-09-19 18:54:33][Warn][response.go:457] sending error response: 500 - Post http://10.3.79.219:6556/query: EOF
[2018-09-19 18:54:33][Warn][response.go:432] write error: write unix /omd/sites/core/tmp/thruk/lmd/live.sock->@: i/o timeout
[2018-09-19 18:54:37][Warn][response.go:457] sending error response: 400 - Post http://10.3.79.218:6556/query: dial tcp 10.3.79.218:6556: connect: cannot assign requested address

Browser:

lmd error - /omd/sites/core/tmp/thruk/lmd/live.sock: The request contains an invalid header. - Post http://10.3.79.218:6556/query: dial tcp 10.3.79.218:6556: connect: cannot assign requested address
at /omd/sites/core/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 386.

It was flaky with 5 backends, then settled down. After increasing to 24 backends it won't work at all. I have already increased the lmd NetTimeout. We have over 2000 backends to add.

Side note: is the new Thruk clustering a better approach than LMD clustering? It does not look like both can be used at the same time.

] Panic: interface conversion: interface {} is nil, not string

lmd - version 1.3.7 (Build: 2.90-labs-edition-v1.3.7)
OMD - Open Monitoring Distribution Version 2.90-labs-edition

The logs below started when I acknowledged a host problem with the comment "Einai proswrina Down" and no expiration time.
Any ideas on what is going on with LMD?

[2019-12-23 00:41:07][Error][peer.go:2693] [noc] Panic: interface conversion: interface {} is nil, not string
[2019-12-23 00:41:07][Error][peer.go:2694] [noc] goroutine 6 [running]:
runtime/debug.Stack(0xc0001a6030, 0x954ad4, 0xe)
        /opt/projects/omd/packages/go-1.11/go-1.11.4/src/runtime/debug/stack.go:24 +0xa7
main.logPanicExitPeer(0xc00023c180)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:2694 +0x125
panic(0x8cbc80, 0xc00052a240)
        /opt/projects/omd/packages/go-1.11/go-1.11.4/src/runtime/panic.go:513 +0x1b9
main.(*Peer).expandCrossServiceReferences(0xc00023c180, 0xc0001aed20, 0xc000bafc40, 0xc000bafce8, 0xc000205c80, 0x0, 0x0, 0x0)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:2818 +0x867
main.(*Peer).initilizeReferences(0xc00023c180, 0xc0001aed20, 0xc000bafce8, 0xc00052a210, 0x0, 0x0)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:2756 +0x28f
main.(*Peer).CreateObjectByType(0xc00023c180, 0xc0001aed20, 0x95138a, 0x8)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:1532 +0x2c1
main.(*Peer).InitAllTables(0xc00023c180, 0xc0001a76e0)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:663 +0x915
main.(*Peer).updateLoop(0xc00023c180)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:314 +0x596
main.(*Peer).Start.func1(0xc00023c180)
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:244 +0x59
created by main.(*Peer).Start
        /opt/projects/omd/packages/lmd/go/src/github.com/sni/lmd/lmd/peer.go:241 +0x178
[2019-12-23 00:41:07][Error][peer.go:2696] [noc] LastQuery:
[2019-12-23 00:41:07][Error][peer.go:2697] [noc] GET comments
ResponseHeader: fixed16
OutputFormat: json
Columns: author comment entry_time entry_type expires expire_time id is_service persistent source type host_name service_description

[2019-12-23 00:41:07][Error][peer.go:2698] [noc] LastResponse:
[2019-12-23 00:41:07][Error][peer.go:2699] [noc] [["srvadmin","Einai proswrina Down",1576963609.0,4.0,false,0.0,1.0,0.0,1.0,1.0,1.0,"web2.noc",null]]

TLS support for TLS enabled livestatus

Hi,

Does LMD work with livestatus backends that are listening on TLS-enabled ports? If yes, could you please point me to documentation on how to use certificates to talk to TLS-enabled livestatus backends?
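
To illustrate the client side of what I'm asking about, here is a plain Go sketch of talking livestatus over a TLS-wrapped port (host, port and certificate path are assumptions; this is not lmd code, just the kind of connection I would like lmd to be able to make):

package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io/ioutil"
	"log"
)

func main() {
	// CA that signed the livestatus endpoint's certificate (path is an assumption).
	pem, err := ioutil.ReadFile("/etc/lmd/ca.pem")
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pem) {
		log.Fatal("no certificates found in ca.pem")
	}

	// TLS-wrapped livestatus port, e.g. livestatus behind stunnel.
	conn, err := tls.Dial("tcp", "backend.example.com:6557", &tls.Config{RootCAs: pool})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// A normal livestatus query over the encrypted connection.
	fmt.Fprint(conn, "GET status\nColumns: program_version\nOutputFormat: json\nResponseHeader: fixed16\n\n")

	// Read the first chunk of the response (enough for this sketch).
	buf := make([]byte, 4096)
	n, _ := conn.Read(buf)
	fmt.Println(string(buf[:n]))
}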

thanks

Checkmk 1.6 labels missing

Type: Wishlist item

Checkmk, starting from version 1.6, introduced host and service labels. When connecting two Checkmk 1.6 sites through LMD, the labels defined on the slave site are not visible on the master site.

Would it be possible to add support for labels to LMD?

LMD is returning an inconsistent response when used on top of Shinken

Hello. We are using LMD in our monitoring infrastructure on top of a Shinken-backed Livestatus server. So far the results have been getting better and better with each release, with significant performance gains, especially when monitoring large clusters with a big number of hosts and services.

However, ever since updating LMD from v1.3.0 to v1.3.3 we've noticed that some fields return different results than the ones yielded by the underlying Shinken engine. As an example, here is a Livestatus query sent to LMD 1.3.3:

danirod@danirod-X541U:~/go/src/github.com/sni/lmd/lmd -> nc localhost 3333
GET hosts
Columns: name is_impact realm

[["localhost",[],""]
]

And here is the same query sent to the underlying Shinken engine:

danirod@danirod-X541U:~/go/src/github.com/sni/lmd/lmd -> nc 192.168.33.20 50000
GET hosts
OutputFormat: json
Columns: name is_impact realm

[["localhost",0,"All"]]

As can be seen:

  • realm changes from "All" (Shinken) to "" (LMD).
  • is_impact even changes the datatype, from 0 (Shinken) to [ ] (LMD).

If I use the LMD 1.3.0 binary instead of LMD 1.3.3 I cannot replicate this issue anymore:

danirod@danirod-X541U:~/go/src/github.com/sni/lmd/lmd -> nc localhost 3333
GET hosts
Columns: name is_impact realm

[["localhost",0,"All"]
]

Further investigation of this issue has made me notice the following. PR #26 added support for new fields that are specific to Naemon.

However, given the query GET hosts, which includes the column list at the top of the response, the following list is returned by LMD 1.3.3. (I did my best to make the following block as readable as possible; I know column lists are a nightmare to read and debug, but please bear with me):

"accept_passive_checks" "acknowledged" "action_url" "action_url_expanded" "active_checks_enabled" "address" "alias"   
"check_command" "check_freshness" "check_interval" "check_options" "check_period" "check_type" "checks_enabled"       
"childs" "contacts" "contact_groups" "comments" "current_attempt" "current_notification_number" "custom_variables"    
"custom_variable_names" "custom_variable_values" "display_name" "downtimes" "event_handler" "event_handler_enabled"   
"execution_time" "first_notification_delay" "flap_detection_enabled" "groups" "hard_state" "has_been_checked"         
"high_flap_threshold" "icon_image" "icon_image_alt" "icon_image_expanded" "in_check_period" "in_notification_period"  
"is_executing" "is_flapping" "last_check" "last_hard_state" "last_hard_state_change" "last_notification" "last_state" 
"last_state_change" "last_time_down" "last_time_unreachable" "last_time_up" "latency" "long_plugin_output"            
"low_flap_threshold" "max_check_attempts" "modified_attributes" "modified_attributes_list" "name" "next_check"        
"next_notification" "num_services" "num_services_crit" "num_services_ok" "num_services_pending" "num_services_unknown"
"num_services_warn" "notes" "notes_expanded" "notes_url" "notes_url_expanded" "notification_interval"                 
"notification_period" "notifications_enabled" "obsess_over_host" "parents" "percent_state_change" "perf_data"         
"plugin_output" "process_performance_data" "retry_interval" "scheduled_downtime_depth" "state" "state_type"           
"staleness" "pnpgraph_present" "obsess" "is_impact" "source_problems" "impacts" "criticity" "is_problem" "realm"      
"poller_tag" "got_business_rule" "parent_dependencies" "lmd_last_cache_update" "peer_key" "peer_name"                 
"last_state_change_order" "has_long_plugin_output"                                                                    

As can be seen, LMD includes not only the columns marked in objects.go as optional and used by the Shinken backend, like poller_tag or is_impact...

lmd/lmd/objects.go

Lines 643 to 651 in 7a85b01

t.AddOptColumn("is_impact", DynamicUpdate, IntCol, Shinken, "Whether the host state is an impact or not (0/1)")
t.AddOptColumn("source_problems", DynamicUpdate, StringListCol, Shinken, "The name of the source problems (host or service)")
t.AddOptColumn("impacts", DynamicUpdate, StringListCol, Shinken, "List of what the source impact (list of hosts and services)")
t.AddOptColumn("criticity", DynamicUpdate, IntCol, Shinken, "The importance we gave to this host between the minimum 0 and the maximum 5")
t.AddOptColumn("is_problem", DynamicUpdate, IntCol, Shinken, "Whether the host state is a problem or not (0/1)")
t.AddOptColumn("realm", DynamicUpdate, StringCol, Shinken, "Realm")
t.AddOptColumn("poller_tag", DynamicUpdate, StringCol, Shinken, "Poller Tag")
t.AddOptColumn("got_business_rule", DynamicUpdate, IntCol, Shinken, "Whether the host state is an business rule based host or not (0/1)")
t.AddOptColumn("parent_dependencies", DynamicUpdate, StringCol, Shinken, "List of the dependencies (logical, network or business one) of this host.")

...but also the obsess column, which should be specific to Naemon.

t.AddOptColumn("obsess", DynamicUpdate, IntCol, Naemon, "The obsessing over host")

I've already placed a few logging statements in peer.go to verify that LMD is correctly identifying our Livestatus backend as Shinken, and indeed this is done OK.

Believe it or not, I could workaround this issue yesterday by swapping the order of the obsess column and moving it after the declaration of the Shinken specific columns, such as:

diff --git a/lmd/objects.go b/lmd/objects.go
index f2efe10..7d86ea1 100644
--- a/lmd/objects.go
+++ b/lmd/objects.go
@@ -636,9 +636,6 @@ func NewHostsTable() (t *Table) {
        t.AddColumn("staleness", DynamicUpdate, FloatCol, "Staleness indicator for this host")
        t.AddColumn("pnpgraph_present", DynamicUpdate, IntCol, "The pnp graph presence (0/1)")
 
-       // naemon specific
-       t.AddOptColumn("obsess", DynamicUpdate, IntCol, Naemon, "The obsessing over host")
-
        // shinken specific
        t.AddOptColumn("is_impact", DynamicUpdate, IntCol, Shinken, "Whether the host state is an impact or not (0/1)")
        t.AddOptColumn("source_problems", DynamicUpdate, StringListCol, Shinken, "The name of the source problems (host or service)")
@@ -650,6 +647,9 @@ func NewHostsTable() (t *Table) {
        t.AddOptColumn("got_business_rule", DynamicUpdate, IntCol, Shinken, "Whether the host state is an business rule based host or not (0/1)")
        t.AddOptColumn("parent_dependencies", DynamicUpdate, StringCol, Shinken, "List of the dependencies (logical, network or business one) of this host.")
 
+       // naemon specific
+       t.AddOptColumn("obsess", DynamicUpdate, IntCol, Naemon, "The obsessing over host")
+
        t.AddColumn("lmd_last_cache_update", RefNoUpdate, VirtCol, "Timestamp of the last LMD update of this object.")
        t.AddColumn("peer_key", RefNoUpdate, VirtCol, "Id of this peer")
        t.AddColumn("peer_name", RefNoUpdate, VirtCol, "Name of this peer")
@@ -771,9 +771,6 @@ func NewServicesTable() (t *Table) {
        t.AddColumn("staleness", DynamicUpdate, FloatCol, "Staleness indicator for this host")
        t.AddColumn("pnpgraph_present", DynamicUpdate, IntCol, "The pnp graph presence (0/1)")
 
-       // naemon specific
-       t.AddOptColumn("obsess", DynamicUpdate, IntCol, Naemon, "The obsessing over service")
-
        // shinken specific
        t.AddOptColumn("is_impact", DynamicUpdate, IntCol, Shinken, "Whether the host state is an impact or not (0/1)")
        t.AddOptColumn("source_problems", DynamicUpdate, StringListCol, Shinken, "The name of the source problems (host or service)")
@@ -785,6 +782,9 @@ func NewServicesTable() (t *Table) {
        t.AddOptColumn("got_business_rule", DynamicUpdate, IntCol, Shinken, "Whether the service state is an business rule based host or not (0/1)")
        t.AddOptColumn("parent_dependencies", DynamicUpdate, StringCol, Shinken, "List of the dependencies (logical, network or business one) of this service.")
 
+       // naemon specific
+       t.AddOptColumn("obsess", DynamicUpdate, IntCol, Naemon, "The obsessing over service")
+
        t.AddRefColumn("hosts", "host", "name", "host_name")
 
        t.AddColumn("lmd_last_cache_update", RefNoUpdate, VirtCol, "Timestamp of the last LMD update of this object.")

yielding the following successful response when querying v1.3.3 - (EDIT: Yeah, I copied the wrong output from my terminal 😞)

danirod@danirod-X541U:~/go/src/github.com/sni/lmd/lmd -> nc localhost 3333
GET hosts
Columns: name is_impact realm

[["localhost",0,"All"]
]

However, while this works, I believe the issue has a deeper and more reasonable explanation than just "you have to swap the column order". That's why I've refrained from sending this diff as a PR. (Or maybe this actually fixes it, but I'd like to get a second opinion on this.)
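
This is not LMD's actual decoding code, only a self-contained Go sketch of the failure mode I suspect: if the column list used to index the response rows is assembled in a different order (or with a different set of optional columns) than what the backend actually delivered, every value after the mismatch lands in the wrong column, which would explain the shifted and empty values shown above:

package main

import "fmt"

func main() {
	// Order in which the table definition lists the columns: the
	// naemon-only "obsess" column first, then the shinken-only ones.
	definitionOrder := []string{"obsess", "is_impact", "realm"}

	// Columns a Shinken backend actually answers with: "obsess" is
	// unknown to Shinken and therefore missing from the response.
	responseColumns := []string{"is_impact", "realm"}
	responseRow := []interface{}{0, "All"}

	// Naive mapping: walk the definition order and consume the response
	// values one by one, without checking which columns the backend
	// really returned. Every value shifts by one position.
	shifted := map[string]interface{}{}
	i := 0
	for _, col := range definitionOrder {
		if i < len(responseRow) {
			shifted[col] = responseRow[i]
			i++
		}
	}
	fmt.Println(shifted) // map[is_impact:All obsess:0]

	// Correct mapping: index by the columns the backend actually sent.
	correct := map[string]interface{}{}
	for i, col := range responseColumns {
		correct[col] = responseRow[i]
	}
	fmt.Println(correct) // map[is_impact:0 realm:All]
}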

site is offline and getting read tcp i/o timeout errors

Our setup: multiple nodes running OMD 1.30 & 2.20-labs-edition with the latest Livestatus version, 1.2.8p18 (updated 2 days back). We have set up lmd (version 1.0.2) on a separate node which fetches the livestatus data of all monitoring nodes.

One of the monitoring nodes always shows as offline in lmd.log, although queries against the livestatus port (6557) on that node work fine. The livestatus data on that remote node is around 72 MB (services table with all columns) and 700 KB (hosts table with all columns).
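
Given those response sizes, one thing worth experimenting with is raising the connection and transfer timeouts in lmd.ini (ConnectTimeout, NetTimeout, StaleBackendTimeout). The values below are only guesses to start from, not recommended settings:

# lmd.ini - timeout settings to experiment with (values are guesses)
ConnectTimeout      = 30
NetTimeout          = 300
StaleBackendTimeout = 120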

incomplete data with Icinga2 with lmd and thruk

Hi,
I am trying to use lmd to access livestatus from two icinga2 instances. It partly works, but the data is unsorted and incomplete in some Thruk tabs.

I have a hostgroup "all-hosts" in icinga, which returns all hosts when clicking the "Host Groups" tab in "Current Status". When I click "Hosts" in "Current Status", not all hosts are shown. The same goes for "Services": not all hosts and not all services for each host are returned.

Any advise?

I had to use the workaround from issue #92 to get any data at all.

Thruk Version 2.32-3
Icinga2 Version: 2.11.2-1
LMD Version: lmd - version 1.7.1 (Build: )

/etc/thruk/thruk_local.conf:

<Component Thruk::Backend>
    <peer>
        name    = Icinga2 HA Cluster
        type    = livestatus
        <options>
            peer   = ip1:6558
            peer   = ip2:6558
        </options>
    </peer>
</Component>

shown_inline_pnp = 1

logcache = mysql://thruk:<pw>@<db-ip>:3306/thruk_logs
logcache_fetchlogs_command = IDO_DB_HOST=<db-ip> IDO_DB_PORT=3306 IDO_DB_USER=icinga IDO_DB_PW=<db-pw> IDO_DB_NAME=icinga /usr/share/thruk/support/icinga2_ido_fetchlogs.sh mysql

/etc/thruk/thruk_local.d/lmd.conf:

use_lmd_core=1
lmd_core_bin=/opt/local/go/bin/lmd

Thank you,
Christoph

No backend available in Thruk

I came across the bug sni/Thruk#641 because I'm experiencing the same issue for a certain service that sometimes contains malformed characters in its critical service output.

I wanted to give LMD a try to be the intermediary between Thruk and Icinga2's Livestatus.

Thruk -> LMD -> Icinga2 Livestatus

After I installed the lmd binary into /usr/bin/lmd, I created/adapted the following configs:

/etc/thruk/lmd.ini:

# Listen for incoming livestatus requests here
# TCP or unix sockets are allowed. Multiple entries are also valid.
# An http address can be defined as well.
Listen          = ["127.0.0.1:3333", "/tmp/lmd.sock"]

# List of cluster nodes (cluster mode).
# All cluster nodes must have their http server enabled (see Listen).
# A bare ip address may be provided if the port is the same on all nodes.
#Nodes           = ["10.0.0.1", "http://10.0.0.2:8080"]

# Timeout for incoming client requests on `Listen` threads
ListenTimeout = 60

# daemon will log to stdout if no logfile is set
LogFile         = "/tmp/lmd.log"

# May be Error, Warn, Info, Debug and Trace
LogLevel        = "Debug"

# After this amount of seconds, a backend will be marked down when there
# is no response
StaleBackendTimeout = 30

# Refresh remote sites every x seconds.
# Fast updates are ok, only changed hosts and services get fetched
# and once every `FullUpdateInterval` everything gets updated.
UpdateInterval = 5

# Run a full update on all objects every x seconds. Set to zero to turn off
# completly. This is usually not required and only needed if for uncommon
# reasons some updates slip through the normal delta updates.
FullUpdateInterval = 600

# After `IdleTimeout` seconds of no activity (incoming querys for this backend)
# the slower update interval of `IdleInterval` seconds will be used.
# Don't set the timeout to low, clients will have to wait for a "spin up"
# query on the first access after idling.
IdleTimeout = 120
IdleInterval = 1800

# Connection timeout settings for remote connections.
# `ConnectTimeout` will be used when opening and testing
# the initial connection and `NetTimeout` is used for transfering data.
ConnectTimeout = 30
NetTimeout = 120

# Skip ssl certificate verification on https remote backends.
# Set to 1 to disabled any ssl verification checks.
SkipSSLCheck = 1

# use tcp connections with multiple sources for clusters
[[Connections]]
name   = "Icinga2 Test Cluster"
id     = "id1"
source = ["10.166.100.13:6558"]

/etc/thruk/thruk_local.d/lmd.cfg:

use_lmd_core=1
lmd_core_bin=/usr/bin/lmd
lmd_core_config=/etc/thruk/lmd.ini

I adjusted the backend in /etc/thruk/thruk_local.conf:

############################################
# put your own settings into this file
# settings from this file will override
# those from the thruk.conf
############################################
first_day_of_week = 1
enable_icinga_features = 1
logcache = mysql://thruk:thruk@localhost:3306/thruk_logs

<Component Thruk::Backend>
    <peer>
        name    = icinga2
        id      = 7215e
        type    = livestatus
        <options>
            peer          = localhost:3333
        </options>
    </peer>
</Component>

And then restarted Apache2.

After I accessed Thruk in the browser, the lmd process was correctly started:

www-data 20401 0.0 0.0 1275160 23956 ? Sl 11:16 0:00 /usr/bin/lmd -pidfile /var/cache/thruk/lmd/pid -config /var/cache/thruk/lmd/lmd.ini -config /etc/thruk/lmd.ini

But now Thruk tells me that there's no backend available:

No Backend available
None of the configured Backends could be reached, please have a look at the logfile for detailed information and make sure the core is up and running.

Details:
icinga2: bad request: backend 7215e does not exist (localhost:3333)

Logs show:

# tail -f /var/log/thruk/thruk.log /tmp/lmd.log
==> /var/log/thruk/thruk.log <==
[2019/01/17 11:26:10][inf-mon02-t][INFO] 21132 Req: 001   mem:  39.79 MB   2.22 MB   dur: 0.018s       (-/0.001s)   size:    0.013 kb   stat: 200   url: remote.cgi?startup
[2019/01/17 11:26:10][inf-mon02-t][INFO] 21132 Req: 002   mem:  40.75 MB   0.87 MB   dur: 0.015s  (0.001s/0.009s)   size:   11.787 kb   stat: 200   url: side.html?_=1547720762236
[2019/01/17 11:26:11][inf-mon02-t][INFO] 21132 Req: 003   mem:  49.52 MB   8.77 MB   dur: 0.067s  (0.001s/0.014s)   size:   26.337 kb   stat: 200   url: conf.cgi?sub=backends
[2019/01/17 11:26:22][inf-mon02-t][INFO] 21256 Req: 001   mem:  37.90 MB   0.28 MB   dur: 0.004s       (-/0.001s)   size:    0.015 kb   stat: 200   url: restricted.cgi
[2019/01/17 11:26:22][inf-mon02-t][ERROR] ***************************
[2019/01/17 11:26:22][inf-mon02-t][ERROR] No Backend available
[2019/01/17 11:26:22][inf-mon02-t][ERROR] icinga2: bad request: backend 7215e does not exist (localhost:3333)
[2019/01/17 11:26:22][inf-mon02-t][ERROR] on page: http://inf-mon02-t/thruk/cgi-bin/tac.cgi
[2019/01/17 11:26:22][inf-mon02-t][ERROR] User: thrukadmin
[2019/01/17 11:26:22][inf-mon02-t][INFO] 21256 Req: 002   mem:  39.72 MB   1.72 MB   dur: 0.021s  (0.001s/0.015s)   size:   12.491 kb   stat: 500   url: tac.cgi

==> /tmp/lmd.log <==
[2019-01-17 11:26:35][Debug][peer.go:1221] [Icinga2 Test Cluster] got hosts answer: size: 2 kB
[2019-01-17 11:26:35][Debug][peer.go:886] [Icinga2 Test Cluster] updated 2 hosts
[2019-01-17 11:26:35][Debug][peer.go:1221] [Icinga2 Test Cluster] got services answer: size: 1 kB
[2019-01-17 11:26:35][Debug][peer.go:1221] [Icinga2 Test Cluster] got services answer: size: 2 kB
[2019-01-17 11:26:35][Debug][peer.go:937] [Icinga2 Test Cluster] updated 4 services
[2019-01-17 11:26:35][Debug][peer.go:1221] [Icinga2 Test Cluster] got comments answer: size: 0 kB
[2019-01-17 11:26:35][Debug][peer.go:1073] [Icinga2 Test Cluster] comments did not change
[2019-01-17 11:26:35][Debug][peer.go:1221] [Icinga2 Test Cluster] got downtimes answer: size: 0 kB
[2019-01-17 11:26:35][Debug][peer.go:1073] [Icinga2 Test Cluster] downtimes did not change
[2019-01-17 11:26:35][Debug][peer.go:826] [Icinga2 Test Cluster] delta update complete in: 5.025975ms
[2019-01-17 11:26:40][Debug][peer.go:1221] [Icinga2 Test Cluster] got status answer: size: 0 kB
[2019-01-17 11:26:40][Debug][peer.go:1221] [Icinga2 Test Cluster] got hosts answer: size: 0 kB
[2019-01-17 11:26:40][Debug][peer.go:886] [Icinga2 Test Cluster] updated 0 hosts
[2019-01-17 11:26:40][Debug][peer.go:1221] [Icinga2 Test Cluster] got services answer: size: 0 kB
[2019-01-17 11:26:40][Debug][peer.go:937] [Icinga2 Test Cluster] updated 0 services
[2019-01-17 11:26:40][Debug][peer.go:1221] [Icinga2 Test Cluster] got comments answer: size: 0 kB
[2019-01-17 11:26:40][Debug][peer.go:1073] [Icinga2 Test Cluster] comments did not change
[2019-01-17 11:26:40][Debug][peer.go:1221] [Icinga2 Test Cluster] got downtimes answer: size: 0 kB
[2019-01-17 11:26:40][Debug][peer.go:1073] [Icinga2 Test Cluster] downtimes did not change
[2019-01-17 11:26:40][Debug][peer.go:826] [Icinga2 Test Cluster] delta update complete in: 3.135669ms
[2019-01-17 11:26:45][Debug][peer.go:1221] [Icinga2 Test Cluster] got status answer: size: 0 kB
[2019-01-17 11:26:45][Debug][peer.go:1221] [Icinga2 Test Cluster] got hosts answer: size: 1 kB
[2019-01-17 11:26:45][Debug][peer.go:886] [Icinga2 Test Cluster] updated 1 hosts
[2019-01-17 11:26:45][Debug][peer.go:1221] [Icinga2 Test Cluster] got services answer: size: 0 kB
[2019-01-17 11:26:45][Debug][peer.go:937] [Icinga2 Test Cluster] updated 0 services
[2019-01-17 11:26:45][Debug][peer.go:1221] [Icinga2 Test Cluster] got comments answer: size: 0 kB
[2019-01-17 11:26:45][Debug][peer.go:1073] [Icinga2 Test Cluster] comments did not change
[2019-01-17 11:26:45][Debug][peer.go:1221] [Icinga2 Test Cluster] got downtimes answer: size: 0 kB
[2019-01-17 11:26:45][Debug][peer.go:1073] [Icinga2 Test Cluster] downtimes did not change
[2019-01-17 11:26:45][Debug][peer.go:826] [Icinga2 Test Cluster] delta update complete in: 3.077227ms
^C

But a manual livestatus test works (on both socket and tcp listener):

# unixcat /tmp/lmd.sock 
GET timeperiods

[["alias","name","in","days","exceptions_calendar_dates","exceptions_month_date","exceptions_month_day","exceptions_month_week_day","exceptions_week_day","exclusions","id","lmd_last_cache_update","peer_key","peer_name"]
,
[1547721308,"never",0,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721308,"businesshours",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721308,"shortbusinesshours",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721308,"24x7",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
]

# telnet localhost 3333
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET timeperiods

[["alias","name","in","days","exceptions_calendar_dates","exceptions_month_date","exceptions_month_day","exceptions_month_week_day","exceptions_week_day","exclusions","id","lmd_last_cache_update","peer_key","peer_name"]
,
[1547721360,"never",0,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721360,"businesshours",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721360,"shortbusinesshours",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
,[1547721360,"24x7",1,[],[],[],[],[],[],[],-1,1547720211,"id1","Icinga2 Test Cluster"]
]
Connection closed by foreign host.
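
Note that the timeperiods output above already shows that the only peer key LMD knows about is "id1" (from the [[Connections]] block), while Thruk asks for backend 7215e, which may be exactly the mismatch behind the error message. The peer keys can be listed explicitly via the peer_key / peer_name columns LMD appends to its tables, for example:

GET timeperiods
Columns: peer_key peer_name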

Versions used:

go version go1.11.4 linux/amd64
thruk 2.26-2
lmd - version 1.4.3 (Build: )
icinga2 2.10.2-1.xenial

Any ideas how to solve this?

lmd crashes with Panic: runtime error: index out of range

Probably related to Icinga/icinga2#5626 and sni/Thruk#762

lmd v1.1.2 crashes immediately when trying to start it against Icinga 2.7.1:

[2017-10-10 09:22:57][Error][main.go:515] Panic: runtime error: index out of range
[2017-10-10 09:22:57][Error][main.go:516] goroutine 27 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
        /usr/lib/go-1.6/src/runtime/debug/stack.go:24 +0x80
main.logPanicExit()
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/main.go:516 +0xe0
panic(0x94d320, 0xc820010060)
        /usr/lib/go-1.6/src/runtime/panic.go:443 +0x4e9
main.(*Peer).UpdateDeltaCommentsOrDowntimes(0xc8200fe4b0, 0xa07c60, 0x8, 0x0, 0x0)
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:723 +0x1cf8
main.(*Peer).UpdateDeltaTables(0xc8200fe4b0, 0xa01a80)
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:497 +0x1c4
main.(*Peer).periodicUpdate(0xc8200fe4b0, 0xc823af1dcd, 0xc823af1dd0)
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:328 +0x4f4
main.(*Peer).updateLoop(0xc8200fe4b0)
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:273 +0x8aa
main.(*Peer).Start.func1(0xc8200fe4b0)
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:188 +0x46
created by main.(*Peer).Start
        /local/hudson/workspace/sua-lmd/label/amd64_ubuntu1604/section/unstable/sua-lmd/build-area/sua-lmd-1.0/go-path/src/github.com/sni/lmd/lmd/peer.go:193 +0x2c8

is Listen only supported in the first config file?

Is it intended behaviour that the Listen config directive is only loaded from the first config file found?

In a recent setup I planned to use the thruk-managed lmd and inject an additional TCP Livestatus listener via Thruk's lmd_core_config option.
But I found that lmd was not opening the socket I specified in the additional config file, only the socket specified in the config file maintained by Thruk.

For my installation I worked around that by using lmd_options = -o Listen=0.0.0.0:3333 in Thruk, which opened the TCP listener as I hoped.

Looking at the testcase in lmd/main_test.go:TestMainConfig() it looks like this is intended behaviour, but I might be reading that wrong since I'm not really Go-literate.
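
To illustrate what I mean (this is only a sketch of the behaviour I observed, not lmd's actual config-loading code), a "first file wins" merge for list options would produce exactly this result; the socket path below is made up:

package main

import "fmt"

// Config holds only the option this example cares about.
type Config struct {
	Listen []string
}

// mergeListen mimics a "first file wins" merge for list options: once an
// earlier file has set Listen, later files no longer extend or replace it.
func mergeListen(files ...Config) (merged Config) {
	for _, f := range files {
		if len(merged.Listen) == 0 {
			merged.Listen = f.Listen
		}
	}
	return merged
}

func main() {
	// Hypothetical paths: the first config is the thruk-managed one,
	// the second is the extra file passed via lmd_core_config.
	thrukManaged := Config{Listen: []string{"/var/cache/thruk/lmd/live.sock"}}
	extraConfig := Config{Listen: []string{"0.0.0.0:3333"}}

	fmt.Println(mergeListen(thrukManaged, extraConfig).Listen)
	// Output: [/var/cache/thruk/lmd/live.sock] - the TCP listener is ignored
}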

livestatus waiting examples don't work with lmd

The livestatus documentation at
https://mathias-kettner.de/checkmk_livestatus.html
says that it is valid to use a Wait Trigger without a WaitCondition.

but lmd says
bad request: WaitTrigger without WaitCondition

I wanted to use the example for polling the logs
GET log
Filter: time >= 1265062900
WaitTrigger: log
WaitTimeout: 2000
which works for naemon but not for LMD.

To me this query seems like a good way to push logs into an external logging system.

Is this intentional? Are you protecting lmd (and me) from doing something stupid?
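
For completeness, an untested sketch of a possible workaround, assuming LMD accepts a WaitCondition on the log table (the condition simply mirrors the filter, so the wait should fire as soon as a new log entry in that time range arrives):

GET log
Filter: time >= 1265062900
WaitCondition: time >= 1265062900
WaitTrigger: log
WaitTimeout: 2000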

regexp filter partly not working

It seems that some regexp filters (Filter: host_name ~~ searchfilter) are not working with the lmd livestatus interface although they work with naemon-livestatus.

Example hostnames:

srvaa01, srvbb01, srvcc01,
srvaa02, srvbb02, srvcc02,
... 03, 04, etc.

Works with both:

Filter: host_name ~~ srv.*01

Works with naemon, but not with lmd

Filter: host_name ~~ srv..01

Works with lmd, but not with naemon

Filter: host_name ~~ srv.{2}01

It looks like a single . (dot) without *, + or {count} doesn't get accepted in the filter syntax, and lmd returns an empty result set.
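
A small Go sketch of one hypothesis that would produce exactly this behaviour (this is a guess, not lmd's actual code): if lmd short-circuits patterns containing no "obvious" regex meta characters into a plain substring match, and the dot is not on that meta character list, then srv..01 is searched literally and matches nothing, while srv.*01 and srv.{2}01 still get compiled as real regular expressions:

package main

import (
	"fmt"
	"regexp"
	"strings"
)

// looksLiteral is the guessed shortcut: if the pattern contains none of
// these meta characters, treat it as a plain substring. Note that '.'
// is missing from the list.
func looksLiteral(pattern string) bool {
	return !strings.ContainsAny(pattern, `|([{*+?^$\`)
}

func match(pattern, value string) bool {
	if looksLiteral(pattern) {
		return strings.Contains(value, pattern)
	}
	return regexp.MustCompile(pattern).MatchString(value)
}

func main() {
	host := "srvaa01"
	for _, p := range []string{"srv.*01", "srv..01", "srv.{2}01"} {
		fmt.Printf("%-10s matches %s: %v\n", p, host, match(p, host))
	}
	// srv.*01    matches srvaa01: true   (contains '*', compiled as regexp)
	// srv..01    matches srvaa01: false  (treated as a literal substring)
	// srv.{2}01  matches srvaa01: true   (contains '{', compiled as regexp)
}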

Cluster mode

Good day!
We are trying to run LMD in our check-mk system and to adapt it a little.

Can you help me with cluster mode and give more information about this?
I've tried to run LMD in this mode, but it wasn't successful.

I started to analyze your code in nodes.go and encountered a problem.

  1. If the number of backends is smaller than the number of nodes, the program crashes in the redistribute function (nodes.go) on this line (see the sketch below):
    list[j] = n.backends[distributedCount+j]

The text of the error: Panic: runtime error: index out of range

  2. If the number of backends is greater than the number of nodes, LMD works fine, but in redistribute the
    backends are distributed between the nodes, and a backend is only started if the node it is attached to has the isMe attribute set to true (struct NodeAddress).

So we don't get information from other backends in cluster mode?
Maybe I did something wrong or misunderstood this.
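
A self-contained Go sketch of the crash from point 1 (illustration only, not lmd's actual nodes.go code): with fewer backends than nodes, the unguarded index backends[distributedCount+j] runs past the end of the slice, and a bounds check avoids the panic:

package main

import "fmt"

// redistribute splits backends across nodes using a fixed chunk size
// (rounded up). Without the bounds check marked below, fewer backends
// than nodes makes backends[distributedCount+j] run past the end of the
// slice and panic with "index out of range".
func redistribute(backends []string, numNodes int) [][]string {
	chunk := (len(backends) + numNodes - 1) / numNodes // ceil division
	if chunk == 0 {
		chunk = 1
	}
	assignments := make([][]string, numNodes)
	distributedCount := 0
	for i := 0; i < numNodes; i++ {
		list := make([]string, chunk)
		for j := 0; j < chunk; j++ {
			if distributedCount+j >= len(backends) { // bounds check
				list = list[:j]
				break
			}
			list[j] = backends[distributedCount+j]
		}
		distributedCount += len(list)
		assignments[i] = list
	}
	return assignments
}

func main() {
	// Two backends distributed over three nodes: without the bounds
	// check above, this call panics.
	fmt.Println(redistribute([]string{"site-a", "site-b"}, 3))
	// Output: [[site-a] [site-b] []]
}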

Thank you.
