
pystatsd's Introduction

A Python statsd client

statsd is a friendly front-end to Graphite. This is a Python client for the statsd daemon.

Code: https://github.com/jsocol/pystatsd
License: MIT; see LICENSE file
Issues: https://github.com/jsocol/pystatsd/issues
Documentation: https://statsd.readthedocs.io/

Quickly, to use:

>>> import statsd
>>> c = statsd.StatsClient('localhost', 8125)
>>> c.incr('foo')  # Increment the 'foo' counter.
>>> c.timing('stats.timed', 320)  # Record a 320ms 'stats.timed'.

You can also add a prefix to all your stats:

>>> import statsd
>>> c = statsd.StatsClient('localhost', 8125, prefix='foo')
>>> c.incr('bar')  # Will be 'foo.bar' in statsd/graphite.
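
You can also time things. A quick sketch with the context-manager form, reusing the client above (do_work() is a stand-in for your own function):

>>> with c.timer('work'):
...     do_work()  # elapsed wall time is sent as the 'work' timer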

Installing

The easiest way to install statsd is with pip!

You can install from PyPI:

$ pip install statsd

Or GitHub:

$ pip install -e git+https://github.com/jsocol/pystatsd#egg=statsd

Or from source:

$ git clone https://github.com/jsocol/pystatsd
$ cd pystatsd
$ python setup.py install

Docs

There are lots of docs in the docs/ directory and on ReadTheDocs.


pystatsd's Issues

statsd does nothing whereas shell test works fine

The following works just fine in Ubuntu 14.04:

echo "test.count 10 `date +%s`" | nc -q0 127.0.0.1 2003

whereas the following does nothing:

from statsd import StatsClient
sc = StatsClient(host='localhost', port=2003, prefix='test', maxudpsize=512)
sc.incr('count', 100)
>>> statsd.__version__
'3.0'

graphite-carbon 0.9.12-3
graphite-web 0.9.12+debian-3
python-whisper 0.9.12-1

A way to change flush interval of a client

Is there a way to change the flush interval on the client side? Metrics are sent every 10 seconds, and I want to widen the interval to a minute or more. (A pipeline gives control over when data is sent, but I am looking for exactly the behaviour described above, so that I don't have to make pipeline use thread-safe myself.)
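
For comparison, a pipeline does give client-side control over when data leaves the process; a minimal sketch using the documented pipeline API (the 10-second figure above is normally the statsd server's flush interval, not something the client controls):

import statsd

client = statsd.StatsClient('localhost', 8125)

# Stats are buffered in the pipeline and sent together when the block exits.
with client.pipeline() as pipe:
    pipe.incr('requests')
    pipe.timing('db.query', 320)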

multiple-metrics packets with udp client?

Hi,
I was looking into having the UDP client send multi-metric packets, similar to what the TCP client already does. What do you think? The statsd "spec" seems to suggest that multi-metric packets don't differentiate between UDP and TCP.

Backoff mode for TCP client

Right now, when StatsD is down, TCPStatsClient tries to re-establish the TCP connection while handling each metric, which is clearly wasteful in terms of resources and network saturation.

The idea is to either implement simple logic, like waiting one second between retries, or to make the retry policy pluggable so that something like exponential backoff is easy to implement.
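
Until something like that exists, the damping can be approximated outside the library. A hypothetical wrapper, assuming the TCP client raises on failed sends (the UDP client swallows errors):

import time

from statsd import TCPStatsClient

class BackoffStatsClient:
    """Hypothetical wrapper: after a failed send, drop metrics for
    `cooldown` seconds instead of reconnecting on every call."""

    def __init__(self, client, cooldown=1.0):
        self._client = client
        self._cooldown = cooldown
        self._down_until = 0.0

    def incr(self, *args, **kwargs):
        if time.monotonic() < self._down_until:
            return  # circuit open: skip the reconnect attempt entirely
        try:
            self._client.incr(*args, **kwargs)
        except OSError:
            self._down_until = time.monotonic() + self._cooldown

stats = BackoffStatsClient(TCPStatsClient('localhost', 8125))
stats.incr('jobs.processed')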

3.2.2 is not published yet

Hi,

I was reading the CHANGES file, and it references version 3.2.2, which was finished in 7c5ccf1, but this version has not been published to PyPI yet.

Could you push it, so we can use the released version instead of a github+sha reference?

Thanks!

buffering doesn't limit the maximum packet size

https://github.com/jsocol/pystatsd/blob/master/statsd/client.py#L67 has a potential data-loss / corruption issue:

    if (0 < len(self._stats)):
        data = '\n'.join(self._stats)
        self._stats = []
        try:
            self._sock.sendto(data.encode('ascii'), self._addr)

The problem with this is that UDP packets over a certain size may be dropped or fragmented; i.e. I've seen Graphite showing odd metrics because a packet was truncated mid-line and a new hierarchy was created from the truncated remainder of the metric name.

I'm not using pystatsd (yet) so my apologies for not sending a proper pull request. Here's the equivalent fragment of my monitoring code using a dirty hard-coded limit which should really be configurable:

        while data:
            message = ""

            while data and len(message) < 4000:
                message += "%s.%s\n" % (key_prefix, data.pop())

            udp_sock.sendto(message, (host, port))
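
For what it's worth, the client now exposes a maxudpsize constructor argument (it appears in a snippet earlier on this page), which caps the payload of each datagram a pipeline sends, so the limit is configurable rather than hard-coded:

import statsd

# Buffered pipeline data is split into packets of at most maxudpsize bytes.
client = statsd.StatsClient('localhost', 8125, maxudpsize=512)

with client.pipeline() as pipe:
    for i in range(1000):
        pipe.incr('event')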

Allow lazy loading or re-configuring StatsClient

Our WSGI application is created something like this:

def build_app(conf_dir):
    ...
    return app

Here, I cannot create the StatsClient inside build_app, because then it wouldn't be importable for use in decorators.

So I'm doing something like this:

class ConfigurableStatsClient(StatsClient):
    """StatsClient extension that allows re-configuration after initialization.

    Implemented because StatsClient does not allow lazy loading.

    """
    def __init__(self, host='localhost', port=8125, prefix=None, maxudpsize=512, enabled=True):
        """Create a new client."""
        super(ConfigurableStatsClient, self).__init__(host, port, prefix, maxudpsize)
        self._enabled = enabled

    def reload(self, host='localhost', port=8125, prefix=None, maxudpsize=512, enabled=True):
        """Reloads client with newly given parameters."""
        self.__init__(host=host, port=port, prefix=prefix, maxudpsize=maxudpsize, enabled=enabled)

    def _send(self, data):
        """Send data to statsd only if client is enabled."""
        if not self._enabled:
            return
        super(ConfigurableStatsClient, self)._send(data)

STATSD = ConfigurableStatsClient(enabled=False)

def build_app(conf_dir):

    STATSD.reload(..., enabled=True)

    return app

Actually, ConfigurableStatsClient provides two features: 1) it allows re-configuring the client later on, and 2) it doesn't send packets when not enabled.

I can make a PR out of this, split into two parts, if that makes sense.

Or is there some already-supported way of doing this?

Support for AF_UNIX sockets?

Hi there!

I'm a big fan of this library and have used it extensively! I have a use case where I need to emit from many services on a box to a single statsd daemon (also on the box) that is listening on an AF_UNIX socket. To accomplish this I've subclassed the StatsClientBase in this library and added support for emitting to an AF_UNIX socket.

I'm happy to submit a patch based on the subclass I mention, but I wanted to ask first if you would be interested in one. And if so, would you prefer it to be "automatic", e.g. in my subclass, if the port parameter is None we assume the address is AF_UNIX? To me that seems a little magical compared to the pystatsd API, which is very explicit (and nice!). Or would you prefer it to be its own StatsClientBase subclass, like StatsClientLocal or something like that?
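
For readers landing here later: a UnixSocketStatsClient did make it into the library (it is mentioned in the 3.3.0 compatibility issue further down this page). Assuming the constructor takes the socket path, usage would look roughly like:

from statsd import UnixSocketStatsClient

# assumption: the first argument is the filesystem path of the AF_UNIX socket
sc = UnixSocketStatsClient('/var/run/statsd.sock', prefix='myservice')
sc.incr('started')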

Incorrect metrics for coroutine functions.

Code example:

In [1]: import asyncio
   ...: from statsd import StatsClient
   ...:
   ...: sc = StatsClient()
   ...:
   ...: @sc.timer("foo")
   ...: async def f():
   ...:     return await asyncio.sleep(2)
   ...:
   ...:

In [2]: %time asyncio.get_event_loop().run_until_complete(f())  # IPython %time command
   ...:
CPU times: user 917 µs, sys: 1.22 ms, total: 2.14 ms
Wall time: 2 s

Expected behavior:
~ 2000ms sent to statsd server
Current behavior:
~ 0.001ms sent to statsd server

I believe the reason for this behavior is this part of the code:

start_time = time_now()
try:
    return f(*args, **kwargs)
finally:
    elapsed_time_ms = 1000.0 * (time_now() - start_time)
    self.client.timing(self.stat, elapsed_time_ms, self.rate)

The coroutine is never awaited, so the metric captures none of the real work.

It seems like the solution should be something like the following (but with backward-compatibility support):

from inspect import iscoroutinefunction

# ...

def __call__(self, f):
    """Thread-safe timing function decorator."""
    if iscoroutinefunction(f):
        @safe_wraps(f)
        async def _wrapped(*args, **kwargs):
            start_time = time_now()
            try:
                return await f(*args, **kwargs)
            finally:
                elapsed_time_ms = 1000.0 * (time_now() - start_time)
                self.client.timing(self.stat, elapsed_time_ms, self.rate)
    else:
        @safe_wraps(f)
        def _wrapped(*args, **kwargs):
            start_time = time_now()
            try:
                return f(*args, **kwargs)
            finally:
                elapsed_time_ms = 1000.0 * (time_now() - start_time)
                self.client.timing(self.stat, elapsed_time_ms, self.rate)
    return _wrapped

Mocking statsd object while running tests

Hey @jsocol

Thanks for your detailed response on #81 . It worked for me using the context manager.

Now, I have a test suite in the project directory where I am using statsd. The method where I have my context manager is tested inside a unit test.

Whenever I run $ python setup.py test, inside the statsd client, I see that the responses are getting logged in the statsd log (I am running statsd by doing a $ node stats.js config.js 2>&1 > /tmp/statsd.log)

27 Jun 07:07:24 - DEBUG: numStats: 4
27 Jun 07:07:32 - DEBUG: services.my_service-10.0.2.15-integrationtest#0_IOPerf:0.245810|ms
27 Jun 07:07:32 - DEBUG: services.my_service-10.0.2.15-integrationtest#0_IOPerf:0.254869|ms
27 Jun 07:07:34 - DEBUG: numStats: 5
27 Jun 07:07:34 - DEBUG: services.my_service-10.0.2.15-integrationtest#0_IOPerf:0.131845|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.015020|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.016212|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.015974|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.010967|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.011206|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.008821|ms
27 Jun 07:07:37 - DEBUG: services.test_IOPerf:0.010014|ms
....
....

Something like the above. Now, I don't want this to happen. How would I mock my statsd object here so that testing doesn't incur this processing overhead?

Method in which I have my statsd context manager, (which is being tested)

statsd = StatsClient(self.config['statsd_client'], prefix=socket.gethostname())
statsd_timer_name = '{0}_IOPerf'.format(self._consumer_tag)

with statsd.timer(statsd_timer_name):
      self.handle(json_message)
...

Any suggestions?

EDIT: Minor changes
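
One way to keep tests from emitting real packets is to patch the client where the code under test looks it up; a sketch using the standard library's unittest.mock (the module path is hypothetical, adjust it to your layout):

from unittest import mock

# Patch where the client is *looked up*, not where StatsClient is defined.
with mock.patch('myapp.worker.statsd') as fake_statsd:
    run_code_under_test()                   # hypothetical call
    fake_statsd.timer.assert_called_once()  # timings recorded, nothing sent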

Unable to install into a fresh virtual environment

I have not had this issue previously. I suspect it is a 2.0 problem.

For some reason it appears that during the installation statsd is requesting settings that are not configured yet.

I am using the git version now as the PyPi version was busted.

I am more than happy to try again after a fix is made.

Here is the stacktrace of the install:

  Running setup.py develop for statsd

    Running command /opt/rh/venv/bin/python -c "import setuptools; __file__='/opt/rh/venv/src/statsd/setup.py'; exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/opt/rh/venv/src/statsd/setup.py", line 3, in <module>
        import statsd
      File "statsd/__init__.py", line 23, in <module>
        host = getattr(settings, 'STATSD_HOST', 'localhost')
      File "/opt/rh/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 52, in __getattr__
        self._setup(name)
      File "/opt/rh/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 45, in _setup
        % (desc, ENVIRONMENT_VARIABLE))
    django.core.exceptions.ImproperlyConfigured: Requested setting STATSD_HOST, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

    Complete output from command /opt/rh/venv/bin/python -c "import setuptools; __file__='/opt/rh/venv/src/statsd/setup.py'; exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/opt/rh/venv/src/statsd/setup.py", line 3, in <module>
        import statsd
      File "statsd/__init__.py", line 23, in <module>
        host = getattr(settings, 'STATSD_HOST', 'localhost')
      File "/opt/rh/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 52, in __getattr__
        self._setup(name)
      File "/opt/rh/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 45, in _setup
        % (desc, ENVIRONMENT_VARIABLE))
    django.core.exceptions.ImproperlyConfigured: Requested setting STATSD_HOST, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

----------------------------------------

Command /opt/rh/venv/bin/python -c "import setuptools; __file__='/opt/rh/venv/src/statsd/setup.py'; exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps failed with error code 1 in /opt/rh/venv/src/statsd

Exception information:
Traceback (most recent call last):
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/basecommand.py", line 107, in main
    status = self.run(options, args)
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/commands/install.py", line 261, in run
    requirement_set.install(install_options, global_options)
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/req.py", line 1166, in install
    requirement.install(install_options, global_options)
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/req.py", line 562, in install
    self.install_editable(install_options, global_options)
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/req.py", line 652, in install_editable
    show_stdout=False)
  File "/opt/rh/venv/local/lib/python2.7/site-packages/pip-1.2.1-py2.7.egg/pip/util.py", line 612, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command /opt/rh/venv/bin/python -c "import setuptools; __file__='/opt/rh/venv/src/statsd/setup.py'; exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps failed with error code 1 in /opt/rh/venv/src/statsd

Statsd fails hard on gaierror

When there are temporary problems with DNS access, decorating a function results in an exception, which introduces unnecessary errors when, for example, it's used with RQ.

    self._addr = (socket.gethostbyname(host), port)

gaierror: [Errno -2] Name or service not known

I think a better solution would be to either drop the metric or log the error.

Noop client if a connection is not available

My code sometimes needs to run in scenarios where a statsd host is not available. I don't want to wrap every call to statsd in a condition.

statsd seems to behave well if I just use an empty string as the host, but I can't help but worry that this is the wrong thing performance-wise.

Is it a bad idea to just fire-and-forget and not worry about a statsd server being there? What is the right way?
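
One common answer, sketched here as a no-op stand-in selected at configuration time (not an official pystatsd feature), keeps the call sites unconditional:

import statsd

class NoopStatsClient:
    """Stand-in covering the plain metric calls; timers used as context
    managers or decorators would need a little more scaffolding."""
    def incr(self, *args, **kwargs): pass
    def decr(self, *args, **kwargs): pass
    def gauge(self, *args, **kwargs): pass
    def timing(self, *args, **kwargs): pass

def make_client(host):
    # an empty/None host means "stats disabled" in this sketch
    return statsd.StatsClient(host, 8125) if host else NoopStatsClient()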

Option to not send timings in case of an exception

The timing decorator is a very concise way of logging timing information for a block of code. We use these timings as the input for SLIs on our platform. When an exception takes place inside a timing context manager, the timing is always sent. In our situation it would be preferable NOT to send a timing in case of an exception.

Would you be open to a PR that adds an extra optional argument to the Timer class that disables sending timings in case of an exception, for example log_on_exception=True? The reason I'm asking is that it seems like a strange argument to have if you use the Timer class without the context manager.

Or, if the above is not something you'd want, perhaps you have other ideas for supporting this? I'm open to doing the implementation!
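
In the meantime, the behaviour can be had with a small wrapper around the plain timing() call; a sketch (timer_on_success is not part of pystatsd):

import time
from contextlib import contextmanager

@contextmanager
def timer_on_success(client, stat, rate=1):
    """Send a timing only if the block exits without raising."""
    start = time.perf_counter()
    yield  # an exception here propagates and skips the timing below
    elapsed_ms = 1000.0 * (time.perf_counter() - start)
    client.timing(stat, elapsed_ms, rate)

Used as "with timer_on_success(statsd, 'handler'): ...", an exception aborts the generator at the yield, so client.timing() never runs.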

suggested docs warning

A common way to use statsd is to create a client for a given process and reuse it continually. That is fine, except for an annoying implementation detail that I noticed while securing a box.

When data is first sent via the socket ( https://github.com/jsocol/pystatsd/blob/master/statsd/client.py#L145-L151 ), Python appears to bind to all interfaces (e.g. 0.0.0.0) on a random port, ready to accept a response. (Side note: it took a while to pinpoint this to pystatsd.) It would make sense to note this inherent behavior in the docs/readme for the next person who is securing their box.

Drop IPv6

So, I went ahead and added IPv6 support to the host lookup without actually noticing that the StatsD server doesn't bind to IPv6 interfaces. It uses Node's dgram.createSocket('udp4'). And, at least on most systems, Python's socket.getaddrinfo() will return an IPv6 address for 'localhost', 8125, first. D'oh.

So, just gotta restrict socket.getaddrinfo() to IPv4, and might as well limit the search down to UDP, etc. I've got a patch, just creating a reminder.

Negative timer values due to time.time

We're seeing very occasional negative timer values due to NTP shifts, since time.time maps to gettimeofday(); rough background on that here.

perf_counter provides the highest-resolution monotonic clock available in Python 3; fixing this for Python 2 seems a little harder. clock seems to be the logical choice there, but it appears to suffer from resolution problems according to the timeit docs here. Without wanting a resolution regression, this bug/edge case will likely have to stay on Python 2.

Comparing the performance between time.time and time.perf_counter doesn't show any performance regression.

In [57]: timeit.timeit('perf_counter()', setup='from time import perf_counter', number=10000000)
Out[57]: 2.137723121792078

In [58]: timeit.timeit('time()', setup='from time import time', number=10000000)
Out[58]: 2.1814921628683805

I'll open a PR with a fix; I'm not sure how I'd write a test for this, however.

`socket.gaierror` is raised if the host is not known

Curious to know if this is intended behaviour or not.

Example:

>>> import statsd
>>> # use the default, "localhost"
... # there is _not_ a statsd server running on localhost:8125
... # initialises ok,
... sc = statsd.StatsClient()
>>> # this time, specify a random hostname
... # again, no statsd server running at this host
... # initialisation causes error
... sc = statsd.StatsClient(host='foobar')
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/usr/local/lib/python3.7/site-packages/statsd/client/udp.py", line 35, in __init__
    host, port, fam, socket.SOCK_DGRAM)[0]
  File "/usr/local/lib/python3.7/socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

I'm wondering if it would be better to catch and ignore this error. On the one hand, that is more in line with the philosophy that errors with stats should not bring down the application; on the other, I could imagine wanting to be sure that my application can log anything at all.

Perhaps a middle ground would be some kind of strict=True/False argument, which toggles between the behaviours?
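
Pending such a flag, strict=False can be approximated at construction time; a sketch (pair it with a no-op stub, like the one sketched under the "Noop client" issue above, to keep call sites unconditional):

import socket

import statsd

def make_client(host, port=8125, strict=False):
    try:
        return statsd.StatsClient(host, port)
    except socket.gaierror:
        if strict:
            raise
        return None  # caller treats None as "stats disabled"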

Add a method to re-run DNS lookup

Hello!

In the Kubernetes world, where the IP of a host can change often, catching all socket errors is not very helpful.

For example, if the IP of the statsd host changes, the UDP client does not re-resolve the name and reconnect to the new IP; the only way to pick up the new address is to recreate the client.

The line which catches all these errors is:

except (socket.error, RuntimeError):
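
Absent such a method, one workaround is to rebuild the client periodically so the name gets resolved again; a hypothetical helper (StatsClient resolves the hostname once, at construction):

import time

import statsd

class RefreshingClient:
    """Hypothetical wrapper: recreate the UDP client every `ttl` seconds
    so a changed DNS record is eventually picked up."""

    def __init__(self, host, port=8125, ttl=300):
        self._args = (host, port)
        self._ttl = ttl
        self._client = None
        self._created = 0.0

    def client(self):
        if self._client is None or time.monotonic() - self._created > self._ttl:
            self._client = statsd.StatsClient(*self._args)
            self._created = time.monotonic()
        return self._client

stats = RefreshingClient('statsd.default.svc.cluster.local')
stats.client().incr('requests')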

Presence of Django in environment doesn't mean statsd should import Django

Suppose that I have Django in my path because I'm developing a Django app. But I have an external tool, unrelated to Django, which requires pystatsd. If I import statsd, it will in turn import Django and use whatever STATSD_HOST it finds in settings.py, even if that is not the STATSD_HOST I want.

This also creates a dependency on Django. For example, if Django starts throwing a different error to signal missing defines in settings.py, pystatsd would have problems (similar to the Django 1.5 bug) -- even for non-Django applications.

If you want pystatsd to have Django support, my opinion is the correct approach is to create a statsd.django module that grabs STATSD_HOST from Django, and require Django apps to import that. For all other uses, importing statsd should have minimal dependencies.
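
That opt-in module might look like this (hypothetical; statsd.django is a suggestion, not an existing module):

# statsd/django.py (hypothetical opt-in module)
from django.conf import settings

from statsd import StatsClient

# Only apps that explicitly import this module pay the Django dependency.
statsd = StatsClient(
    host=getattr(settings, 'STATSD_HOST', 'localhost'),
    port=getattr(settings, 'STATSD_PORT', 8125),
    prefix=getattr(settings, 'STATSD_PREFIX', None),
)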

How to push text data?

Hi,
Is it possible to push textual data to statsd listeners?

For example:

mysql_version: "v8.9"
interface_name: "e1"

Regards,
Sriram
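
For context (not from the thread): the statsd wire protocol only carries numeric counters, timers, and gauges, so free-form text cannot be stored as a metric value. The closest built-in is a set, which accepts arbitrary values but only counts how many unique ones were seen per flush interval:

import statsd

client = statsd.StatsClient('localhost', 8125)

# Graphite will show how many distinct versions reported, not the strings.
client.set('mysql.version', 'v8.9')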

Question on additional exception handling in UDP client

Wondering if you would be open to a PR adding additional exception handling, catching all exceptions while attempting to send UDP stats instead of just socket.error and RuntimeError? In case, for example, data.encode('ascii') throws an error, but the user doesn't want to crash the application because of it.

E.g. changing this section:

pystatsd/statsd/client.py

Lines 145 to 151 in 151ab6b

def _send(self, data):
    """Send data to statsd."""
    try:
        self._sock.sendto(data.encode('ascii'), self._addr)
    except (socket.error, RuntimeError):
        # No time for love, Dr. Jones!
        pass

To this:

def _send(self, data):
    """Send data to statsd."""
    try:
        self._sock.sendto(data.encode('ascii'), self._addr)
    except Exception:
        # No time for love, Dr. Jones!
        pass

If not, this is certainly doable by the user of the lib; I'm just wondering if you would like to add it to trunk.

timer used as a decorator can lead to runtime errors "Already sent data"

I think there should be a warning about using timers as decorators in http://statsd.readthedocs.org/en/latest/timing.html#using-a-decorator.

For example, if you use a timer decorator on a Flask view, you risk running into a RuntimeError("Already sent data.") if you have concurrent requests. This is because the decorator is evaluated only once, so the enter and exit methods are called on the same Timer object for different requests.

To avoid this, you should simply use a context manager inside of the function:

def foo():
    with sc.timer("foo"):
        bar()

instead of

@sc.timer("foo")
def foo():
    bar()

Here is a test function that shows the issue:

from time import sleep
from random import random
from threading import Thread
def test_timer_decorator_concurrent():
    sc = _client()

    @sc.timer('bar')
    def bar():
        sleep(random())

    nb_threads = 10
    threads = []
    for i in xrange(nb_threads):
        t = Thread(target=bar)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

Docs: Are timing decorators thread safe or not?

At http://statsd.readthedocs.org/en/latest/timing.html#using-a-decorator it says:

The timer attribute decorates your methods in a thread-safe manner.

and on http://statsd.readthedocs.org/en/latest/reference.html#timer there is a big yellow warning box that says:

Decorators are not thread-safe and may cause errors when decorated functions are called concurrently. Use context managers or raw timers instead.

I'll use a with context manager for now, just in case. Thanks for the library!

can't specify address family, may use the wrong one

Hi,

the latest release comes with IPv6 support enabled by default. I think it should be an option that is off by default, because e.g. etsy/statsd has IPv6 support disabled by default, so updating pystatsd can easily break metrics collection, and finding the cause will not be obvious (it happened to me :)).

ability to send statsd tags

Looking for a way to send tags with a given metric so that I can slice and dice the data more easily.

It would be great if pystatsd supported passing tags as follows:

statsd_client.incr('upload.request_count', tags=['env:local', 'db:remote'])
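
Plain statsd/Graphite has no tag concept, so until a client option like that exists, one stop-gap is to fold tags into the metric path; a hypothetical helper (tag-aware servers such as DogStatsD use a different wire format entirely):

def with_tags(stat, tags):
    """Hypothetical helper: 'upload.request_count' + ['env:local'] becomes
    'upload.request_count.env_local' (graphite-style path segments)."""
    return '.'.join([stat] + sorted(t.replace(':', '_') for t in tags))

statsd_client.incr(with_tags('upload.request_count', ['env:local', 'db:remote']))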

Use class variable for instantiating statsd object to use timer decorator

I am trying to use the statsd timer decorator

My file structure is something like this

├── config.py
├── consumers
│   ├── foo_consumer.py

Inside foo_consumer.py

from statsd import StatsClient
import socket

# I am specifying the statsd client IP in the file "config.py"
statsd = ""
statsd_timer_name = 'foo_consumer_IOPerf_{0}'.format(socket.gethostname())

class FooConsumer(Consumer):
    def __init__(self, config):
        # Consumer class passed on the config.py file to this class
        global statsd
        statsd = StatsClient(config['statsd_client'], 8125)

    @statsd.timer(statsd_timer_name)
    def handle(self):
        """my heavyweight function"""

This class is inherited by another class and then instantiated by it in the parent folder.

Error that I get

AttributeError: 'str' object has no attribute 'timer'

What else I have tried

It is necessary that the statsd_client IP be specified in the config.py file.

I thought about passing the statsd_client parameter to the decorator like this

class FooConsumer(Consumer):
    def __init__(self, config):
        self.statsd = StatsClient(config['statsd_client'], 8125)

    @self.statsd.timer(statsd_timer_name)
    def handle(self):
        """my heavyweight function"""

This wouldn't work, as the class hasn't been instantiated and self is not known at this point.

Any alternatives?
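
One alternative (a sketch, not from the thread): skip the decorator and use the context-manager form inside the method, so the client can live on the instance and is only looked up at call time:

from statsd import StatsClient

class FooConsumer(Consumer):  # Consumer as in the snippets above
    def __init__(self, config):
        self.statsd = StatsClient(config['statsd_client'], 8125)

    def handle(self):
        """my heavyweight function"""
        # self.statsd is resolved when handle() runs, not at class definition
        with self.statsd.timer(statsd_timer_name):
            do_heavy_work()  # hypothetical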

Can't import if django in $PYTHONPATH, but not a django application

I have a python application I want to send stats from. Django is installed in the $PYTHONPATH, so

from django.conf import settings

works and settings is defined. [as in https://github.com/jsocol/pystatsd/blob/e81fb4dea2077f6e9e873d07db831bbf931df174/statsd/init.py#L6]

However, if I import this from my app, this happens:

  File "/home/vagrant/workspace/myproject/src/myproject/__init__.py", line 13, in <module>
    from statsd import StatsClient
  File "/home/vagrant/workspace/myproject/lib/python2.7/site-packages/statsd-2.0.1-py2.7.egg/statsd/__init__.py", line 20, in <module>
    host = getattr(settings, 'STATSD_HOST', 'localhost')
  File "/usr/local/lib/python2.7/dist-packages/django/conf/__init__.py", line 52, in __getattr__
    self._setup(name)
  File "/usr/local/lib/python2.7/dist-packages/django/conf/__init__.py", line 45, in _setup
    % (desc, ENVIRONMENT_VARIABLE))
django.core.exceptions.ImproperlyConfigured: Requested setting STATSD_HOST, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

HTH,

gauges are not supported in statsd current version

Are gauges supported in the current statsd version embedded in this image?
I tried to use them but I think the data is not being recorded.
Gauges were added to the statsd server in commit 0ed78be.
Do you know if this version of statsd supports them?
If not, how can I update the current version of statsd?
Thanks!

Gauge Delta

Add support for gauge deltas since that merged into statsd/master.

Timer decorators on multiple functions send the same time values

This might be a bug, or by design and just my understanding of the docs that's bugged:

I have a script (just a single thread) that is using decorators to apply timers to various functions so that I can see how long is spent in those functions on each run.
However, it seems that all timers, despite being different instances with different names, send the same time value. (In my script, some functions take several seconds to run, yet all statsd timer values were showing up as around 66ms.)

For example, a test script to demonstrate this:

import time
import statsd

sd=statsd.StatsClient('statsdhost', 8125, prefix='test')

@sd.timer('wait_1')
@sd.timer('wait_2')

def wait_1():
        time.sleep(1)

def wait_2():
        time.sleep(10)

wait_2()
wait_1()
with sd.timer('test2'):
        wait_2()

sends the following values (captured from tcpdump between the machine the script is running on and the statsd host):

test.wait_2:1001|ms
test.wait_1:1002|ms
test.test2:10010|ms

The values aren't quite the same (1001 vs 1002), so it presumably isn't just using the same timer value, but test.wait_2 clearly isn't ~10,000ms either. Changing the order that wait_1 and wait_2 are called does not change the value that is sent for both functions.

So, is this an issue, or is it my misunderstanding of the code, and you can simply only have one timer running on a decorated function? The docs don't say you can have more than one, but they don't say you can't either!

StatsClient.timer and timeout exception

    def __exit__(self, typ, value, tb):
        dt = time.time() - self.start
        self.ms = int(round(1000 * dt))  # Convert to ms.
        self.client.timing(self.stat, self.ms, self.rate)

I guess this means that in case of a request timeout exception, the timings will be submitted anyway?

AttributeError: module 'statsd' has no attribute 'StatsClient'

I get the error: module 'statsd' has no attribute 'StatsClient'

# pip3 --version
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)
# pip3 list| grep stats
statsd (3.3.0)
~# python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import statsd
>>> c = statsd.StatsClient('localhost', 8125)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'statsd' has no attribute 'StatsClient'

help is empty

>>> help(statsd)

output

Help on package statsd:

NAME
    statsd

PACKAGE CONTENTS
    client (package)

FILE
    (built-in)

(END)

How to map the api key of a hostedgraphite server

Hi James,

I am using pystatsd to send data to Hosted Graphite, where I have an account, but I am wondering how to use my API key with this library.

client = statsd.StatsClient('statsd.hostedgraphite.com', 8125, prefix='staging.workers')
client.gauge('queueSize', 5)

How will Hosted Graphite map this data to my account? :)

Best,
Anurag
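
Hosted Graphite's statsd endpoint identifies the account by an API-key prefix on each metric name, so, as far as their documented pattern goes, the key can be folded into the client prefix; a sketch with a placeholder key:

import statsd

API_KEY = 'your-hostedgraphite-api-key'  # placeholder

client = statsd.StatsClient('statsd.hostedgraphite.com', 8125,
                            prefix=API_KEY + '.staging.workers')
client.gauge('queueSize', 5)  # arrives as <api-key>.staging.workers.queueSize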

Version 3.3.0 breaks backward compatibility

Hi,
we use type annotations in our projects. And because we want to run our projects with a variety of clients according to actual demand (for example UnixSocketStatsClient, TCPStatsClient, or StatsClient), we use the supertype statsd.client.StatsClientBase in annotations. Unfortunately, StatsClientBase has been moved to statsd.client.base, so that code is broken now. We know it is an undocumented API, but it is a public class.

I created a pull request for it: #121

Best regards

Negative values passed to gauge interpreted as delta

With the current chilly weather we actually got negative Fahrenheit data values, and when logged as a gauge we quickly arrived at -75 and plummeting ever faster. Oh wait, no: the slightly negative values were being interpreted as delta values.

A fix for this is to send a 0 value before sending the negative value when delta is false.
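
From the caller's side the workaround looks like this (a sketch; the fix proposed above would move it inside the client):

import statsd

client = statsd.StatsClient('localhost', 8125)

# On the statsd server, "stat:-12|g" means "subtract 12 from the gauge",
# so send an explicit zero first to set an absolute negative value.
pipe = client.pipeline()
pipe.gauge('outdoor.temperature', 0)
pipe.gauge('outdoor.temperature', -12)
pipe.send()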

Timer cannot decorate a partial function

How to reproduce:

>>> import functools
>>> import time
>>> import statsd
>>> client = statsd.StatsClient()
>>> sleep5 = functools.partial(time.sleep, 5)
>>> client.timer('mystat')(sleep5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/statsd/client.py", line 26, in __call__
  File "/usr/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: 'functools.partial' object has no attribute '__module__'

There seem to be several approaches to fixing it, for example using a safe version of functools.wraps. What do you think?
I would be happy to contribute a fix!
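
The safe-wraps approach amounts to tolerating objects that functools.wraps chokes on; a sketch (the helper the library actually adopted may differ):

import functools

def safe_wraps(wrapped, *args, **kwargs):
    """Like functools.wraps, but a no-op for functools.partial objects,
    which lack __module__/__name__ and break update_wrapper on Python 2."""
    if isinstance(wrapped, functools.partial):
        return lambda f: f
    return functools.wraps(wrapped, *args, **kwargs)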

with gevent

When I use celery with gevent, statsd doesn't work.
