
carbon's Introduction

Carbon

Carbon is one of the components of Graphite. It is responsible for receiving metrics over the network and writing them to disk using a storage backend. Currently, Whisper is our stable, supported backend, and Ceres is the work-in-progress future replacement for Whisper.

Overview

Client applications can connect to the running carbon-cache.py daemon on port 2003 (default) and send it lines of text of the following format:

my.metric.name value unix_timestamp

For example:

performance.servers.www01.cpuUsage 42.5 1208815315
  • The metric name is like a filesystem path that uses a dot as a separator instead of a forward-slash.

  • The value is a scalar integer or floating-point number.

  • The unix_timestamp is Unix epoch time, as an integer.

Each line like this corresponds to one data point for one metric.
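As a minimal sketch of the line format above, the following Python builds one line per data point and sends it over TCP. The function names and the "localhost" host are assumptions made for illustration; only the line layout and the default port 2003 come from the text.

```python
import socket
import time


def format_line(metric, value, timestamp=None):
    """Return one plaintext line: 'my.metric.name value unix_timestamp\n'."""
    if timestamp is None:
        timestamp = time.time()
    # The timestamp is sent as an integer number of seconds since the epoch.
    return "%s %s %d\n" % (metric, value, int(timestamp))


def send_lines(lines, host="localhost", port=2003):
    """Send already-formatted lines to a carbon-cache line receiver.

    host is a placeholder; 2003 is the default port mentioned above.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Example (requires a running carbon-cache):
# send_lines([format_line("performance.servers.www01.cpuUsage", 42.5)])
```

Formatting is kept separate from sending so the line layout can be checked without a running daemon.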

Alternatively, clients can send pickle-formatted messages to port 2004 (default), which is considered faster than the line-based format.
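A sketch of how a pickle message might be framed, assuming the commonly documented layout: a list of (metric, (timestamp, value)) tuples, pickled, and prefixed with a 4-byte big-endian length. The function name is hypothetical, and the layout should be checked against the carbon version you run.

```python
import pickle
import struct


def pack_datapoints(datapoints):
    """Frame a batch of datapoints for the pickle receiver.

    datapoints is a list of (metric, (timestamp, value)) tuples.
    The result is a 4-byte unsigned big-endian length followed by the pickle.
    """
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))
    return header + payload


# Example: one data point for one metric.
message = pack_datapoints(
    [("performance.servers.www01.cpuUsage", (1208815315, 42.5))]
)
```

The resulting bytes can be written to a TCP socket connected to port 2004. Batching many data points into one message is what makes this format cheaper than sending one line per point.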

Once you've got some clients sending data to carbon-cache, you can view graphs of that data through the frontend Graphite Web application.

Running carbon-cache.py

First, tell carbon-cache which user it should run as by specifying the user account in $GRAPHITE_ROOT/conf/carbon.conf. This user must have write privileges to both $GRAPHITE_ROOT/storage/whisper and $GRAPHITE_ROOT/storage/log/carbon-cache.

Alternatively, you can run carbon-cache/carbon-relay/carbon-aggregator as Twistd plugins, for example:

Usage: twistd [options] carbon-cache [options]
Options:
      --debug       Run in debug mode.
  -c, --config=     Use the given config file.
      --instance=   Manage a specific carbon instance. [default: a]
      --logdir=     Write logs to the given directory.
      --whitelist=  List of metric patterns to allow.
      --blacklist=  List of metric patterns to disallow.
      --version     Display Twisted version and exit.
      --help        Display this help and exit.

Common options to twistd(1), such as --pidfile, --logfile, --uid, --gid, --syslog and --prefix, are fully supported and take precedence over carbon-*'s own options. Please refer to twistd --help for the full list of supported twistd options.

Writing a client

First, decide what data you want to graph with Graphite. The script examples/example-client.py demonstrates a simple client that sends loadavg data for your local machine to carbon once a minute.

The default storage schema stores data in one-minute intervals for 2 hours. This is probably not what you want, so you should create a custom storage schema according to the docs on the Graphite wiki.
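When planning a custom schema, it can help to work out how many data points a given interval and retention period imply. The helper below is a hypothetical illustration of that arithmetic only; it does not read or write any carbon configuration.

```python
def points_in_archive(seconds_per_point, retention_seconds):
    """Return how many data points fit in one archive.

    Both arguments are in seconds; the names are illustrative.
    """
    if seconds_per_point <= 0:
        raise ValueError("seconds_per_point must be positive")
    return retention_seconds // seconds_per_point


# The default described above: one-minute intervals kept for 2 hours.
default_points = points_in_archive(60, 2 * 60 * 60)

# A longer plan, for comparison: one-minute intervals kept for 30 days.
month_points = points_in_archive(60, 30 * 24 * 60 * 60)
```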

carbon's People

Contributors

bmhatfield, cdavis, gumuz, jblaine, jdanbrown, jerith, josip, mleinart, nleskiw, sidnei, tmm1

Forkers

prokod

carbon's Issues

Removing a queue file from underneath the relay can cause it to get very upset

I cleaned up zero-length files from the spool's temp/* directories. That's not the safest thing to do in general, but in this case it seemed to cause a weird class of problem:

2013-09-05T22:36:28.87664 05/09/2013 22:36:28 :: [clients] fname is 1378419989.91.000 and new_name is /mnt/graphite/spool/carbon/send/ec2-54-242-197-216.compute-1.amazonaws.com:2004/1378419989.91.000
2013-09-05T22:36:28.87683 05/09/2013 22:36:28 :: [console] Unhandled Error
2013-09-05T22:36:28.87684 Traceback (most recent call last):
2013-09-05T22:36:28.87684   File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 402, in startReactor
2013-09-05T22:36:28.87685     self.config, oldstdout, oldstderr, self.profiler, reactor)
2013-09-05T22:36:28.87685   File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 323, in runReactorWithLogging
2013-09-05T22:36:28.87686     reactor.run()
2013-09-05T22:36:28.87687   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 1192, in run
2013-09-05T22:36:28.87687     self.mainLoop()
2013-09-05T22:36:28.87688   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
2013-09-05T22:36:28.87688     self.runUntilCurrent()
2013-09-05T22:36:28.87689 --- <exception caught here> ---
2013-09-05T22:36:28.87689   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 824, in runUntilCurrent
2013-09-05T22:36:28.87690     call.func(*call.args, **call.kw)
2013-09-05T22:36:28.87690   File "/opt/graphite/lib/carbon/client.py", line 219, in sendQueued
2013-09-05T22:36:28.87691     self.open_next_queue_file_list()
2013-09-05T22:36:28.87692   File "/opt/graphite/lib/carbon/client.py", line 115, in open_next_queue_file_list
2013-09-05T22:36:28.87692     os.rename(qf[1], new_name)
2013-09-05T22:36:28.87693 exceptions.OSError: [Errno 2] No such file or directory
2013-09-05T22:36:28.87694 
2013-09-05T22:36:56.63736 removed `/opt/graphite/storage/carbon-relay-a.pid'
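One way the relay could survive a queue file vanishing is to treat a missing source as "nothing to do" rather than a fatal error. This is a sketch under that assumption, not the code in carbon/client.py; the function name is hypothetical.

```python
import errno
import os


def rename_if_present(src, dst):
    """Move src to dst, returning False if the file has already vanished.

    Only ENOENT is swallowed; any other OSError still propagates.
    Note that ENOENT is also raised when the directory of dst is missing,
    so a caller may want to log the False case.
    """
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return False
        raise
    return True
```

A caller would skip the queue entry when this returns False instead of letting the exception reach the reactor.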

Zero-length files aren't being removed from the send queue. They probably shouldn't get there in the first place.

Watch is showing me this in my test vagrant:

Every 2.0s: ls -lRsh /var/spool/carbon/send/ec2-50-16-2-85.compute-1.amazonaws.com:2004 /var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazon...  Tue Jun 25 19:01:52 2013

/var/spool/carbon/send/ec2-50-16-2-85.compute-1.amazonaws.com:2004:
total 0   

/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004:
total 0   

/var/spool/carbon/temp/ec2-50-16-2-85.compute-1.amazonaws.com:2004:
total 0   
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 18:33 1372185211.47
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 18:48 1372186111.47
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 19:01 1372186891.42

/var/spool/carbon/temp/ec2-54-235-34-178.compute-1.amazonaws.com:2004:
total 0   
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 18:29 1372184971.47
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 18:45 1372185931.47
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 19:00 1372186831.47
0 -rw-r--r-- 1 www-data www-data 0 Jun 25 19:01 1372186891.42
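A small sketch for spotting the stale zero-length files shown in the listing above. It only reports them; whether it is safe to delete them while the relay is running is exactly the open question in the previous issue. The function name and the age threshold are assumptions.

```python
import os
import time


def find_empty_files(directory, min_age_seconds=60, now=None):
    """Return sorted paths of zero-length regular files older than min_age_seconds.

    The age check avoids flagging a file that was only just created
    and has not been written to yet.
    """
    if now is None:
        now = time.time()
    empty = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        if os.path.getsize(path) != 0:
            continue
        if now - os.path.getmtime(path) >= min_age_seconds:
            empty.append(path)
    return empty
```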

Abandoned child netcat processes will sit around

The nc process that is a child of repr-pickle-sender gets left hanging around.

Using netcat's "-w" timeout option should cover this as a safety measure (though it may not be the right thing to do here). The problem most likely comes from the use of the subprocess module and the need to call p.stdin.write(), because subprocess.Popen.communicate() cannot act as a pipe. Without replacing subprocess with os.fork, fixing this on the nc side may be the easiest solution.
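As an alternative to relying on netcat alone, the parent could enforce its own deadline and kill the child when it is exceeded. The sketch below uses Python 3's timeout support in subprocess, which is newer than the Python 2.7 seen in the tracebacks on this page; it also uses communicate() for brevity, so it illustrates the deadline idea rather than the streaming write the sender performs. The function name is hypothetical.

```python
import subprocess


def send_with_deadline(cmd, payload, timeout_seconds):
    """Run cmd, feed payload to its stdin, and kill it if it overruns.

    Returns the child's exit code. On POSIX a killed child reports a
    negative code.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        proc.communicate(input=payload, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The child did not finish in time: kill it, then reap it so
        # that no abandoned process is left behind.
        proc.kill()
        proc.communicate()
    return proc.returncode
```

For example, cmd could be a list such as ["nc", host, port], with the host and port supplied by the caller.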

If the send queue dir is moved, it can cause an exception. This should be seamless

Moving the queue dir aside is how we recover from an excessively large queue, so the relay should handle it cleanly.

Current

2013-11-25T19:58:28.03246 25/11/2013 19:58:28 :: [console] Unhandled Error
2013-11-25T19:58:28.03249 Traceback (most recent call last):
2013-11-25T19:58:28.03250   File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 402, in startReactor
2013-11-25T19:58:28.03251     self.config, oldstdout, oldstderr, self.profiler, reactor)
2013-11-25T19:58:28.03252   File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 323, in runReactorWithLogging
2013-11-25T19:58:28.03253     reactor.run()
2013-11-25T19:58:28.03253   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 1192, in run
2013-11-25T19:58:28.03254     self.mainLoop()
2013-11-25T19:58:28.03254   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
2013-11-25T19:58:28.03255     self.runUntilCurrent()
2013-11-25T19:58:28.03256 --- <exception caught here> ---
2013-11-25T19:58:28.03256   File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 824, in runUntilCurrent
2013-11-25T19:58:28.03257     call.func(*call.args, **call.kw)
2013-11-25T19:58:28.03257   File "/opt/graphite/lib/carbon/client.py", line 101, in sendQueued
2013-11-25T19:58:28.03258     self._sendDatapoints(self.factory.takeSomeFromQueue())
2013-11-25T19:58:28.03258   File "/opt/graphite/lib/carbon/client.py", line 70, in _sendDatapoints
2013-11-25T19:58:28.03259     self.factory.queue_file.write(json.dumps(datapoints) + "\n")
2013-11-25T19:58:28.03261 exceptions.ValueError: I/O operation on closed file

generate-repr-flood needs to create a directory hierarchy

The current pattern used for creating metrics puts everything directly under test/. That's impossible to navigate via any web UI.

Break up the metric name with a "." every 2 characters, e.g. "test.ab.cd.ef" instead of "test.abcdef".
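A minimal sketch of the proposed naming, assuming the flood generator produces a flat token such as "abcdef" for each metric. The function name and parameters are illustrative and are not taken from generate-repr-flood.

```python
def hierarchical_name(prefix, token, width=2):
    """Insert a '.' every `width` characters of token, under prefix."""
    if width <= 0:
        raise ValueError("width must be positive")
    parts = [token[i:i + width] for i in range(0, len(token), width)]
    return ".".join([prefix] + parts)


# "test.abcdef" becomes "test.ab.cd.ef"
example = hierarchical_name("test", "abcdef")
```

A token whose length is not a multiple of the width simply ends with a shorter final component.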

Support a way to defer moving a temp file when we need to shuffle directories

Directories sometimes grow to the point where moving them is necessary, so moving a directory should be made safe. I'd like to have something like:

sudo touch destination-defer && sudo mv destination destination-old && sudo mkdir destination && sudo chown user:group destination && sudo rm destination-defer

This would provide something like a 10-second window, since these operations should never take very long on local disk. If the defer file is left in place by accident, it should elicit warnings in the log. As I'm currently thinking about it, though, it shouldn't prevent operation once 10 seconds have passed since it was created.
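The window described above could be checked by looking at the age of the marker file. This is a sketch of that idea only; the function name is hypothetical, and using the file's modification time as its creation time is an assumption that holds for a marker made with touch.

```python
import os
import time


def defer_active(marker_path, window_seconds=10, now=None):
    """Return True while a defer marker exists and is younger than the window.

    A missing marker means no deferral. A marker older than the window is
    ignored, so a forgotten file cannot block operation forever; the caller
    should log a warning in that case.
    """
    if now is None:
        now = time.time()
    try:
        age = now - os.path.getmtime(marker_path)
    except OSError:
        return False
    return age < window_seconds
```

With the command sequence above, the marker would be the destination-defer file next to the destination directory.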

If the destination is down, the number of children will be exceeded

So things are running, then are interrupted:

2013-06-27T14:44:36.88523 INFO: queue dir has 33 waiting items and 10 active processes
2013-06-27T14:44:37.99145 INFO: At Thu Jun 27 14:44:37 2013 items contains: [(28257, ('1372344228.92', 1372344230.843907, 14)), (28227, ('1372344218.32', 1372344218.544696, 8)), (28250, ('1372344223.83', 1372344225.410363, 12)), (28246, ('1372344221.71', 1372344222.888793, 11)), (28238, ('1372344219.53', 1372344219.658196, 9)), (28253, ('1372344226.31', 1372344227.926234, 13)), (28242, ('1372344220.63', 1372344220.77751, 10)), (28214, ('1372344214.32', 1372344215.184942, 5)), (28218, ('1372344216.12', 1372344216.202053, 6)), (28221, ('1372344217.18', 1372344217.323327, 7))]
2013-06-27T14:44:39.02243 Traceback (most recent call last):
2013-06-27T14:44:39.02246   File "/vagrant/graphite/bin/repr-pickle-sender.py", line 60, in <module>
2013-06-27T14:44:39.02346     sub_p.stdin.write(struct.pack(struct_format, len(p)) + p)
2013-06-27T14:44:39.02369 IOError: [Errno 32] Broken pipe
2013-06-27T14:44:39.09648 INFO: queue dir has 34 waiting items and 10 active processes

Then when the first failures start, the more aggressive code I added to help parallel sends get going seems to be doing something funny:

2013-06-27T14:44:39.09662 Pid: 28214 failed with an exit code of 256
2013-06-27T14:44:39.76843 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344214.32']
2013-06-27T14:44:39.76846 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344231.80']
2013-06-27T14:44:39.76849 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344234.04']
2013-06-27T14:44:39.76850 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344236.42']
2013-06-27T14:44:39.76851 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344238.40']
2013-06-27T14:44:39.76852 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344240.60']
2013-06-27T14:44:39.76853 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344242.53']
2013-06-27T14:44:39.76856 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344244.55']
2013-06-27T14:44:39.76857 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344246.74']
2013-06-27T14:44:39.76858 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344248.75']
2013-06-27T14:44:39.76859 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344250.53']
2013-06-27T14:44:39.76860 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344252.63']
2013-06-27T14:44:39.76861 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344254.79']
2013-06-27T14:44:39.76864 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344256.74']
2013-06-27T14:44:39.76865 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344258.47']
2013-06-27T14:44:39.76866 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344260.49']
2013-06-27T14:44:39.76867 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344262.37']
2013-06-27T14:44:39.76868 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344264.39']
2013-06-27T14:44:39.76869 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344266.27']
2013-06-27T14:44:39.76871 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344268.36']
2013-06-27T14:44:39.76872 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344270.12']
2013-06-27T14:44:39.76873 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344272.22']
2013-06-27T14:44:39.76874 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344274.07']
2013-06-27T14:44:39.76876 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344276.22']
2013-06-27T14:44:39.76877 ['repr-pickle-sender.py', 'ec2-54-235-34-178.compute-1.amazonaws.com', '2004', '/var/spool/carbon/send/ec2-54-235-34-178.compute-1.amazonaws.com:2004/1372344277.84']
2013-06-27T14:44:39.92200 INFO: 34 active children, limit is 10. Passing.

Whoops. Make sure the limit isn't exceeded.
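One way to keep the limit from being exceeded is to compute the number of free slots before launching anything, including on the retry path that runs after a failure. This is a sketch with hypothetical names, not the logic currently in the sender scripts.

```python
def files_to_start(waiting, active_count, limit=10):
    """Return the queue files that may be launched right now.

    waiting is an ordered list of queue file paths, active_count is the
    number of sender children still running, and limit is the maximum
    number of children allowed at once.
    """
    free_slots = max(0, limit - active_count)
    return waiting[:free_slots]


# With 10 children already running, nothing new is started,
# no matter how many items are waiting in the queue dir.
nothing = files_to_start(["1372344231.80", "1372344234.04"], active_count=10)
```

Applying the same check on every launch path means a burst of failures frees slots only as children actually exit.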
