
simplemonitor's People

Contributors

andrewmcguinness, andronkyr, cabalist, ccremer, cgroschupp, cpina, danieldh206, danieleteti, dependabot-preview[bot], dependabot-support, dependabot[bot], energieloesungen, error454, hslatman, jamesoff, jmcclelland, makered, minektur, mrgfisher, nguyen127001, pheuzoune, progval, r4r3dev, rarosalion, shakreiner, simeonfelis, slomkowski, snyk-bot, wsw70, wxcafe


simplemonitor's Issues

MonitorHost not working on non english Windows OS

On non-English Windows systems, MonitorHost does not work because the output of ping is language-dependent.
For example, in French:

C:\Users\tlegras>ping -n 1 -w 5000 127.0.0.1

Envoi d'une requête 'Ping'  127.0.0.1 avec 32 octets de données :
Réponse de 127.0.0.1 : octets=32 temps<1ms TTL=128

Statistiques Ping pour 127.0.0.1:
    Paquets : envoyés = 1, reçus = 1, perdus = 0 (perte 0%),
Durée approximative des boucles en millisecondes :
    Minimum = 0ms, Maximum = 0ms, Moyenne = 0ms

C:\Users\tlegras>ping -n 1 -w 5000 128.0.0.1

Envoi d'une requête 'Ping'  128.0.0.1 avec 32 octets de données :
Réponse de 192.168.169.254 : Impossible de joindre le réseau de destination.

Statistiques Ping pour 128.0.0.1:
    Paquets : envoyés = 1, reçus = 1, perdus = 0 (perte 0%),

Unfortunately, I don't have a solution that would work on every Windows workstation. One option would be to make MonitorHost.ping_regexp configurable. In my case, in network.py line 172, I changed:
self.ping_regexp = "Reply from "
to
self.ping_regexp = "Réponse de "
(be careful with the file encoding...)
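
A minimal sketch of the "configurable ping_regexp" idea, assuming the pattern is read from the monitor's config options. The `ping_regexp` config key and the `ping_succeeded` helper are hypothetical names used for illustration, not part of the project.

```python
import re

# Default taken from the value quoted in the issue above.
DEFAULT_PING_REGEXP = "Reply from "


def ping_succeeded(output, config_options):
    """Return True if any line of the ping output matches the success pattern.

    The "ping_regexp" key is an assumed option name, not an existing setting.
    """
    pattern = re.compile(config_options.get("ping_regexp", DEFAULT_PING_REGEXP))
    return any(pattern.search(line) for line in output.splitlines())
```

Note that in the French output above the "network unreachable" reply also starts with "Réponse de ", so a stricter pattern (for example one that also requires "TTL=") may be needed in practice.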

service status monitor flavors

There are various ways to check the status of a service on *nix systems, but currently only the /usr/local/etc/rc.d/ script is run. For example, on an Ubuntu-based distro I can use service * status or /etc/init.d/* status ...
I would suggest having a list of commands that are tried in turn; if none succeeds, an error is raised.
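
The suggestion could be sketched roughly as follows. The function name and the injectable `runner` parameter are assumptions made for illustration; each command is an argument list as accepted by `subprocess.call`.

```python
import subprocess


def service_is_running(commands, runner=subprocess.call):
    """Try each status command in turn; return True as soon as one exits with 0.

    Raises RuntimeError if no command succeeds, as proposed in the issue.
    """
    for cmd in commands:
        try:
            if runner(cmd) == 0:
                return True
        except OSError:
            # The command does not exist on this system; try the next one.
            continue
    raise RuntimeError("no status command succeeded: %r" % (commands,))
```

A caller might pass something like `[["service", "nginx", "status"], ["/etc/init.d/nginx", "status"]]`, where `nginx` is only a placeholder service name. One design caveat: a stopped service and a missing command both end up in the error path here, so a real monitor may want to distinguish them.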

gap ignored

gap doesn't seem to be honoured. I'm not a Python coder at all, but this seems to fix it (sorry, I know this should be a pull request, but I'm also not a GitHub guy).

*** Monitors/monitor.py-orig    Thu Sep  1 15:06:41 2016
--- Monitors/monitor.py Thu Sep  1 15:07:34 2016
***************
*** 77,82 ****
--- 77,84 ----
              self.set_remote_alerting(int(config_options["remote_alerts"]))
          if 'recover_command' in config_options:
              self.set_recover_command(config_options["recover_command"])
+         if 'gap' in config_options:
+             self.set_gap(config_options["gap"])
          self.running_on = self.short_hostname()
          self.name = name

(Opt) enhancement: allow alerters to repeat their message ...

I'm a fan of getting flooded with alarms if things go wrong, just to make sure they are not overlooked (e.g. if the mailbox is filled up with other stuff).

This quick hack adds a "repeat" option to the alerter. If it is not 0, the alerter keeps sending alarms (but not during OOH) while honouring the configured limit: the virtual failure count must be an integer multiple of it.

Markus

*** Alerters/alerter.py-orig    Fri Sep  2 11:36:53 2016
--- Alerters/alerter.py Fri Sep  2 12:17:53 2016
***************
*** 12,17 ****
--- 12,18 ----
      hostname = gethostname()
      available = False
      limit = 1
+     repeat = 0

      days = range(0, 7)
      times_type = "always"
***************
*** 34,39 ****
--- 35,42 ----
              self.set_dependencies([x.strip() for x in config_options["depend"].split(",")])
          if 'limit' in config_options:
              self.limit = int(config_options["limit"])
+         if 'repeat' in config_options:
+             self.repeat = int(config_options["repeat"])
          if 'times_type' in config_options:
              times_type = config_options["times_type"]
              if times_type == "always":
***************
*** 133,140 ****
                              return "catchup"
                          else:
                              return "failure"
!             if monitor.virtual_fail_count() == self.limit:
!                 # This is the first time we've failed
                  if out_of_hours:
                      if monitor.name not in self.ooh_failures:
                          self.ooh_failures.append(monitor.name)
--- 136,143 ----
                              return "catchup"
                          else:
                              return "failure"
!             if monitor.virtual_fail_count() == self.limit or (self.repeat and (monitor.virtual_fail_count() % self.limit == 0)):
!                 # This is the first time or nth time we've failed
                  if out_of_hours:
                      if monitor.name not in self.ooh_failures:
                          self.ooh_failures.append(monitor.name)
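
To make the new condition easier to reason about, here is the same test pulled out into a standalone function. This is a sketch only; `should_alert` is a hypothetical name, and the arguments stand in for `monitor.virtual_fail_count()`, `self.limit` and `self.repeat`.

```python
def should_alert(fail_count, limit, repeat):
    """Mirror the patched condition: alert at the limit, and at every multiple if repeating."""
    if fail_count == limit:
        return True  # first time the limit is reached
    # With repeat enabled, alert again whenever the count is a multiple of the limit.
    return bool(repeat) and fail_count % limit == 0
```

With limit=3, this alerts on the 3rd failure only when repeat is 0, and on the 3rd, 6th, 9th, ... failure when repeat is non-zero. It assumes the count is at least 1 when the check runs, since 0 is a multiple of every limit.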

docker support

Hi there

I just wanted to inform you that I created a docker image for this project. It's available on Docker Hub or on GitHub. If you make your own build, you can connect Docker Hub with GitHub so that a new image gets built automatically when you push or tag a commit.

cheers

Restarting Application with fail_command

I am trying to have the monitor watch a TCP connection and, if it fails, restart the application that initially opens the connection. I currently do this by running the .py file from the command line using the fail_command in the monitor.ini file, but once it does this, the monitor gets stuck in that application, so it no longer watches the reconnected connection.

Any recommendations to have the monitor watch that reconnected connection? First thought is dealing with threading but wondering if there is another way.
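
One possible approach, short of threading, is to have the command launched by fail_command start the application in the background and return immediately. A minimal sketch, assuming a small wrapper script is acceptable; `launch_detached` is a hypothetical helper name.

```python
import subprocess
import sys


def launch_detached(argv):
    """Start a command and return at once, without waiting for it to exit."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not share the caller's console input
        stdout=subprocess.DEVNULL,  # discard output so no pipe can fill up and block
        stderr=subprocess.DEVNULL,
    )
```

A wrapper could then call `launch_detached([sys.executable, "my_app.py"])`, where `my_app.py` is a placeholder for the application being restarted. Because `Popen` does not wait for the child, control returns to the monitor right after the process is started.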

{virtual_fail_count} is always 0

Hi,
I've been using simplemonitor for a few weeks, and I noticed that in fail_command and success_command (in the monitors.ini file) the variable {virtual_fail_count} is always zero.
Do you have any idea how to debug that?
I'm on the feature/python3 branch (but it was the same on the master branch).

Obtain and log public IP address

I have a situation, where an internet connection has an automatic failover to a 4G/SIM based connection, which takes about a minute to establish a connection, and will provide another public IP address, which may cause connection issues to certain services.

Therefore, I would like to monitor and log such a change of public IP address.

It would be nice to add an option that allows the public IP address to be obtained and logged.

This requires an external web service to be accessed, whose URL needs to be made a parameter.

To extract the public IP address from the service response, you could either simply use a regex to extract the first IPv4-formatted string in the response ('([0-9]{1,3}\.){3}[0-9]{1,3}'), or allow a regex parameter to be supplied.
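
As a rough sketch of the regex extraction step only (fetching the response from the external service is left out), assuming the response body is already available as text. `extract_ipv4` is a hypothetical helper name.

```python
import re

# Four groups of 1-3 digits separated by literal dots. This does not check
# that each group is <= 255; it only finds IPv4-shaped strings.
IPV4_RE = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def extract_ipv4(text):
    """Return the first IPv4-formatted string in text, or None if there is none."""
    match = IPV4_RE.search(text)
    return match.group(0) if match else None
```

The dots are escaped so that they match only a literal ".", which matters when the response contains other digits.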
Possible public ip address services:

Command Monitor

I think what is missing is a generic monitor that would launch a preconfigured command (in monitors.ini) and check the result using a regexp. This would make it possible to monitor virtually anything that is not covered by the existing monitors. This monitor could also check whether the output is below a maximum value (to monitor memory, disk, CPU ...)

Ex1: monitor whether some process is running

command="ps auxww | grep mysoftware"
result_regexp="mysoftware -myparam"

Ex2: if the command returns a value, test whether the value is below a given maximum. In this example we count the number of postgres connections and check that it is below the maximum

[postgresql-connectionmax-monitor]
command="ps auxww | grep ^postgres | wc -l"
result_max=100

I have already coded such a monitor and can send it to you if you are interested (it is not under git, but it is just one file plus 2 lines in monitor.py)
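
The two checks described above might look roughly like this. This is an illustrative sketch, not the code mentioned in the issue; the parameter names follow the config examples (`command`, `result_regexp`, `result_max`), and the function name is made up.

```python
import re
import subprocess


def check_command(command, result_regexp=None, result_max=None):
    """Run a shell command and test its output.

    Passes when the output matches result_regexp (if given) and, when
    result_max is given, the output parsed as an integer is below it.
    """
    # shell=True so that pipelines such as "ps auxww | grep ..." work.
    raw = subprocess.check_output(command, shell=True)
    output = raw.decode("utf-8", errors="replace")
    if result_regexp is not None and not re.search(result_regexp, output):
        return False
    if result_max is not None and int(output.strip()) >= result_max:
        return False
    return True
```

`subprocess.check_output` raises `CalledProcessError` when the command exits with a non-zero status, so a real monitor would probably catch that and record a failure.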

https with Python 3

With Python 3, you need to add pyOpenSSL to your requirements.txt.

Without this library, requests is much slower with websites that use strong ciphers.

Error in startup

Hello,
I get this when I run monitor.py
SimpleMonitor v1.7
--> Loading main config from monitor.ini
--> Loading monitor config from monitors.ini
Unable to trap SIGHUP... maybe it doesn't exist on this platform.
No monitors loaded :(

My monitor.ini looks like this:
[monitor]
interval=60

[reporting]
loggers=logfile

[logfile]
type=logfile
filename=monitor.log
only_failures=1

[dummyhost-ping]
type=host
host=192.168.1.3
tolerance=2

I am currently using Python 2.7.13.

[Windows] UnboundLocalError: local variable 'certfile' referenced before assignment

When running on Windows 7, I get the following error log on startup:

>python monitor.py
SimpleMonitor v1.7
--> Loading main config from monitor.ini
--> Loading monitor config from monitors.ini
Unable to trap SIGHUP... maybe it doesn't exist on this platform.
Traceback (most recent call last):
  File "monitor.py", line 407, in <module>
    main()
  File "monitor.py", line 307, in main
    m = load_monitors(m, monitors_file, options.quiet)
  File "monitor.py", line 118, in load_monitors
    new_monitor = Monitors.network.MonitorHTTP(monitor, config_options)
  File "C:\Users\me\Downloads\jamesoff-simplemonitor-v1.6-27-g8e5249f\jamesoff-simplemonitor-8e5249f\Monitors\network.py", line 82, in __init__
    self.certfile = certfile
UnboundLocalError: local variable 'certfile' referenced before assignment

I fixed it by moving lines 82-83 in Monitors/network.py to the end of the if 'certfile' in config_options: block starting on line 66.

This allows my non-https monitor to work properly, but I haven't tested it to see if that change breaks https monitors.
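
An alternative way to avoid the unbound local variable is to read optional settings with `dict.get`, so the attribute is always assigned. A minimal sketch, assuming `config_options` is a plain dict; the class below is a hypothetical stand-in rather than the real MonitorHTTP, and the `keyfile` option is an assumption.

```python
class HTTPMonitorSketch:
    """Hypothetical stand-in for the monitor class; only option handling is shown."""

    def __init__(self, config_options):
        # dict.get returns None when the key is absent, so both attributes
        # are always set and no local variable is left unassigned.
        self.certfile = config_options.get("certfile")
        self.keyfile = config_options.get("keyfile")  # assumed option name
```

A monitor without the option then simply ends up with `certfile` set to `None`.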

ping command is wrong for non-windows systems

For Windows you are doing a "ping -n 1 -w 5000 %s", which sends 1 ping with a timeout of 5000 ms, or 5 seconds.

If it is not a windows system it basically uses "ping -c1 -t5 %s" which sets the TTL to 5, which is NOT the same as timeout. "-t" is the number of hops. I think you probably mean to use -w5 or -W5.
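
A sketch of choosing the command per platform, assuming Linux iputils ping for the non-Windows case (where -W is the per-reply timeout in seconds). Flag meanings differ between ping implementations (on BSD and macOS, for instance, -t is a timeout in seconds), so these values should be checked against the local man page. `ping_command` is a hypothetical helper name.

```python
import platform


def ping_command(host, system=None):
    """Build an argument list for a single ping with a 5 second timeout."""
    system = system or platform.system()
    if system == "Windows":
        # -n: number of echo requests, -w: timeout in milliseconds
        return ["ping", "-n", "1", "-w", "5000", host]
    # Assumes iputils ping: -c: count, -W: seconds to wait for a reply
    return ["ping", "-c", "1", "-W", "5", host]
```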

Python 3 support

No idea how compatible with Python 3 this is, but it would be nice if it worked.

support for SSL certificate client authentication with HTTPMonitor

Hi,
Today I needed to add support for client authentication in HTTPMonitor. I made the modifications in a local copy of the git repo (basic support only; password-protected keyfiles are not supported). I am ready to contribute it if you are interested...
If so, just note that I am a beginner with GitHub :)
I took a look at the pull request button: it seems a branch must be created first? I have worked in a branch locally on my workstation; is there some kind of sync to propagate the branch creation to GitHub, or is it the owner of the project who does that?
(I am using gitgui, not GitHub Desktop)

Multiple hosts with same monitor definition and network logging

I have an issue with a setup where I have a central monitor host generating a status page and multiple hosts monitoring disk space (the monitor is called "diskspace" on all the hosts).
It seems that the central monitor uses only the last information it receives from the individual hosts and displays just a single line with the diskspace status, even though there are about 20 hosts. The host shown in the status page changes from time to time, showing information from different hosts, but never more than one host at a time.

Do I have to use a different monitor name for every host? If so, would it be possible to use environment variables in the configuration? The motivation is that the hosts are automatically provisioned with the simplemonitor docker container, and I'm not able to prepare a unique image for every host.
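
To illustrate what environment variable expansion in a monitor name could look like, here is a small sketch using `string.Template`. This is not an existing simplemonitor feature; `expand_monitor_name` and the `${HOSTNAME}` placeholder are assumptions.

```python
import os
import socket
import string


def expand_monitor_name(name, environ=None):
    """Substitute ${VAR} placeholders in a monitor name from the environment."""
    env = dict(os.environ if environ is None else environ)
    # Fall back to the machine's hostname when HOSTNAME is not exported.
    env.setdefault("HOSTNAME", socket.gethostname())
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return string.Template(name).safe_substitute(env)
```

With a section named `diskspace-${HOSTNAME}`, every container would report under a distinct name while sharing the same image and config file.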

Timestamps in the log file

The current log file contains integer-type timestamps at the beginning of each line:

monitor.log:

1469763249 dr-http: ok
1469763249 dr-ping: ok
1469763260 dr-http: ok
1469763260 dr-ping: ok
1469763270 dr-http: ok
1469763270 dr-ping: ok
1469763280 dr-http: ok
1469763280 dr-ping: ok
...

When using this log file for trouble-shooting events, it is difficult to interpret these timestamps. Would it not be possible to add an option to have the timestamps formatted in a human readable format, such as ISO 8601, like so...

2016-07-29T21:12:49 dr-http: ok
2016-07-29T21:12:49 dr-ping: ok
2016-07-29T21:13:00 dr-http: ok
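
The conversion itself is short. A sketch using only the standard library; formatting in UTC is an assumption made here so the result does not depend on the machine's time zone, and `format_timestamp` is a hypothetical name.

```python
from datetime import datetime, timezone


def format_timestamp(epoch_seconds):
    """Format an integer Unix timestamp as an ISO 8601 string in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")
```

A logger option could switch between the integer form and this one; dropping `tz=timezone.utc` would format in local time instead.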

hierarchical tree organisation in HTMLLogger

Improvement: I am working on a modified version of HTMLLogger where monitors are organised in a tree (using CompoundMonitor). The idea is to expand/collapse top-level CompoundMonitors. If you are interested, I could propose a pull request once I have tested it for some time. The changes are mainly in the JavaScript, with a few in HTMLLogger. The state of the tree is preserved even when the page reloads each minute :)

success_command is running in excess

As mentioned in #58 where I am using /bin/bash to email me on fail_command and success_command, I am noticing that I get 1 fail command email which is expected, but on success command email I am receiving 4 of them.

I think the expected behavior should be to receive only 1? Am I missing something?

Thanks!

Slack webhook payload configuration

I've used Slack webhooks a bit, and from what I understand they expect a POST containing the message you want posted. However, I am not seeing how I can configure that in my Slack alerter.

Slack alerter exception on missing channel

Getting this exception while trying to get Slack alerting to work:
exception caught while alerting for mymonitor: SlackAlerter instance has no attribute 'channel'

The same exception is generated for this config (tried this one as the channel attribute is not mandatory)

[slackalert]
type=slack
url=https://hooks.slack.com/services/my/generated/webhook

but for this one as well:

[slackalert]
type=slack
url=https://hooks.slack.com/services/my/generated/webhook
channel=alerts
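
For reference, a Slack incoming webhook expects a JSON body with a "text" field, and the channel can be left out. The sketch below shows one way to treat the channel as optional when building that body; `build_slack_payload` is a hypothetical helper, not the alerter's actual code.

```python
import json


def build_slack_payload(message, channel=None):
    """Build the JSON body for a Slack incoming webhook; channel is optional."""
    payload = {"text": message}
    if channel:
        # Only include the key when a channel was actually configured.
        payload["channel"] = channel
    return json.dumps(payload)
```

Inside an alerter, reading the setting with something like `getattr(self, "channel", None)` would avoid the missing-attribute error when the option is absent.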

Python 3 Support

Any interest in me porting the code to run on Python 3 as well? Is there a preferred way to do that? I see a branch for Python3 but it looks like it is just a travis.yml change.

Should I wait for the subprocess branch to merge?

Startup error

esoff-simplemonitor-9a863a6>python monitor.py
SimpleMonitor v1.7
--> Loading main config from monitor.ini
Traceback (most recent call last):
  File "monitor.py", line 413, in <module>
    main()
  File "monitor.py", line 284, in main
    interval = config.getint("monitor", "interval")
  File "C:\Python27\lib\ConfigParser.py", line 359,
    return self._get(section, int, option)
  File "C:\Python27\lib\ConfigParser.py", line 356,
    return conv(self.get(section, option))
  File "C:\Python27\lib\ConfigParser.py", line 607,
    raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'monitor'
I get this error every time I run monitor.py. I couldn't figure out what's wrong. Please provide a solution as soon as possible.

Elapsed time in log file

When performing internet-based requests (HTTP, Ping, DNS), it can be important not only to be able to detect failures, but also be able to determine the duration (milliseconds) of a request.

As a minimum, it would be nice to be able to append the duration onto each log line, e.g.

monitor.log:

1469763249 dr-http: ok (150ms)
1469763249 dr-ping: ok (42ms)
1469763260 dr-http: ok (142ms)
...

An additional option would be to be able to set a maximum acceptable duration on the completion of a monitor request, and have a failure reported if this duration is exceeded.
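
Both parts of the request (recording the duration and failing when it is too long) can be sketched with a small wrapper. `timed_check` and `max_duration_ms` are hypothetical names; `check` is any callable that returns True on success.

```python
import time


def timed_check(check, max_duration_ms=None):
    """Run check() and return (ok, elapsed_ms).

    A successful check is turned into a failure when it takes longer
    than max_duration_ms.
    """
    start = time.monotonic()  # monotonic clock is unaffected by system clock changes
    ok = bool(check())
    elapsed_ms = (time.monotonic() - start) * 1000.0
    if ok and max_duration_ms is not None and elapsed_ms > max_duration_ms:
        ok = False
    return ok, elapsed_ms
```

The elapsed value could then be appended to the log line in the "(150ms)" style shown above.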

Unable to Run simplemonitor

I recently recloned the project and then typed
python2 monitor.py
and it gave me this:

harkishen@harkishen-Aspire-A515-51G:~/Desktop/git works/simplemonitor$ python2 monitor.py 
SimpleMonitor v1.7
--> Loading main config from monitor.ini
--> Loading monitor config from tests/monitors.ini
Adding host monitor test1
Adding fail monitor test2
Adding command monitor command1
Adding command monitor command2
Adding command monitor command3
Adding command monitor command4
Adding http monitor http
--> Loaded 7 monitors.

Adding slack alerter slack

--> Starting... (loop runs every 5s) Hit ^C to stop
error_count = 0, interval = 5 --> 0
1
error_count = 1, interval = 5 --> 1
.error_count = 2, interval = 5 --> 2
error_count = 3, interval = 5 --> 3
.error_count = 4, interval = 5 --> 4
error_count = 5, interval = 5 --> 0
.error_count = 0, interval = 5 --> 0
error_count = 1, interval = 5 --> 1
.error_count = 2, interval = 5 --> 2
error_count = 3, interval = 5 --> 3
.error_count = 4, interval = 5 --> 4
^C
--> Quitting.
--> Finished.

It failed to run. Can anyone tell me what's going on?
This is my python version :

Python 2.7.14 (default, Sep 23 2017, 22:06:14) 
[GCC 7.2.0] on linux2

ping times not displayed

I am testing on win7 with python 3.6.

last_run_duration is logged but the actual ping time is not. I looked into this and may have found something.

in simplemonitor/Monitors/network.py

within MonitorHost class, run_test function is present

    def run_test(self):
        r = re.compile(self.ping_regexp)
        r2 = re.compile(self.time_regexp)
        success = False
        pingtime = 0.0
        try:
            cmd = (self.ping_command % self.host).split(' ')
            output = subprocess.check_output(cmd)
            for line in str(output).split("\n"):
                matches = r.search(line)
                if matches:
                    success = True
                else:
                    matches = r2.search(line)
                    if matches:
                        pingtime = matches.group("ms")
        except Exception as e:
            self.record_fail(e)
            return False
        if success:
            if pingtime > 0:
                self.record_success("%sms" % pingtime)
            else:
                self.record_success()
            return True
        self.record_fail()
        return False

If the first regex (r) is matched, the next one is not run. The next regex is the one that grabs the actual ping time and stores it in last_result via self.record_success.
Edit: As it happens, I overlooked the for loop. Still looking.
Edit: Problem is with:

for line in str(output).split("\n"):

output needs to be decoded like this:

for line in output.decode('utf-8').split("\n"):
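
A self-contained sketch of the decoding fix, assuming the ping output arrives as bytes from `subprocess.check_output`. The `TIME_RE` pattern is an assumption (the project's own `time_regexp` is not shown above); only the group name "ms" is taken from the quoted code.

```python
import re

# Assumed pattern: matches "time=12.3 ms" as well as the Windows "time<1ms" form.
TIME_RE = re.compile(r"time[=<](?P<ms>\d+(?:\.\d+)?)\s*ms", re.IGNORECASE)


def parse_ping_time(raw_output):
    """Return the first ping time in milliseconds found in raw bytes, or None."""
    # Decode first: str() on bytes yields the "b'...'" repr with escaped
    # newlines, so splitting that on "\n" gives a single line.
    text = raw_output.decode("utf-8", errors="replace")
    for line in text.splitlines():
        match = TIME_RE.search(line)
        if match:
            return float(match.group("ms"))
    return None
```

Returning a float also keeps a later comparison such as `pingtime > 0` valid on Python 3, where comparing a string with an integer raises a TypeError.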

Ultimately, I'd like the actual ping time to be logged. If the above is fixed, is modifying save_result2 in FileLogger the only way to print it? Or is there an option that I overlooked?

PS: How do I quote code here without literally pasting it?

Ping is showing as 'passed' when the host is 'unreachable'

Testing this on a windows machine on my home network. I assume it's user error.

I set the monitors.ini to ping my chromecast ip

[chromecast-ping]
type=host
host="ip Address of my chromecast"
tolerance=1

and my monitor.ini to

[monitor]
interval=30

[reporting]
loggers=logfile

[logfile]
type=logfile
filename=monitor.log
only_failures=0

When I run it, the log file says

1518975287 chromecast-ping: ok (2.751s)
1518975320 chromecast-ping: ok (2.996s)

But if I were to ping the IP from the cmd line, I get a 'destination host unreachable.'
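
One possible explanation, not confirmed here, is that Windows ping reports an unreachable host as a "Reply from <address>: Destination host unreachable." line, which a success pattern of "Reply from " would still match. A sketch of a stricter check, under the assumption that real echo replies always carry a "TTL=" field; `host_replied` is a hypothetical name.

```python
import re

# Real echo replies include a TTL value; "unreachable" replies do not.
REPLY_WITH_TTL = re.compile(r"TTL=\d+", re.IGNORECASE)


def host_replied(ping_output):
    """Return True only if some line of the ping output contains a TTL field."""
    return any(REPLY_WITH_TTL.search(line) for line in ping_output.splitlines())
```

Because it does not depend on the words "Reply from", a check like this would also be independent of the Windows display language.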
