cloudprober's People

Contributors

aidarbek, ali-sattari, amdw, amirrot, andradaaag, bekriebel, cbroglie, chemidy, clivern, clyde-xu, cochva, dazwilkin, dhananjaysathe, dsnet, eliblight, evgenii-petrov-arrival, guodongli-google, itamarkam, jeis2497052, kant, katee, kohend, liehendi11, ls692, manugarg, mgarolera, robinmccorkell, robpickerill, ryrose, tbuchier

cloudprober's Issues

Cloudprober should export its runtime stats

To help debug performance issues, cloudprober should export its runtime stats:

  1. Number of goroutines (runtime.NumGoroutine()) - gauge
  2. Fraction of time spent in GC (runtime.ReadMemStats - runtime.MemStats.GCCPUFraction) - counter
  3. Number of allocs and frees (runtime.MemStats.Mallocs, runtime.MemStats.Frees) - counter
  4. Overall CPU and memory usage of cloudprober. Memory usage can be obtained from runtime.MemStats.Sys (gauge). CPU usage will require us to parse /proc/self/stat, which is Linux-specific.

We can extend the sysvar module and add another file that includes utilities to do all this.
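A minimal sketch of how the first three items and the memory part of the fourth could be collected with the standard runtime package. The RuntimeStats type and the CollectRuntimeStats function are hypothetical names chosen for illustration; they are not part of cloudprober.

```go
package main

import (
	"fmt"
	"runtime"
)

// RuntimeStats holds the values listed in the issue. The type and its field
// names are hypothetical; they are not part of cloudprober.
type RuntimeStats struct {
	Goroutines    int     // current number of goroutines
	GCCPUFraction float64 // fraction of available CPU time used by GC since start
	Mallocs       uint64  // cumulative count of heap objects allocated
	Frees         uint64  // cumulative count of heap objects freed
	SysBytes      uint64  // bytes of memory obtained from the OS
}

// CollectRuntimeStats reads the current values from the Go runtime.
func CollectRuntimeStats() RuntimeStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return RuntimeStats{
		Goroutines:    runtime.NumGoroutine(),
		GCCPUFraction: m.GCCPUFraction,
		Mallocs:       m.Mallocs,
		Frees:         m.Frees,
		SysBytes:      m.Sys,
	}
}

func main() {
	s := CollectRuntimeStats()
	fmt.Printf("goroutines=%d gc_cpu_fraction=%f mallocs=%d frees=%d sys_bytes=%d\n",
		s.Goroutines, s.GCCPUFraction, s.Mallocs, s.Frees, s.SysBytes)
}
```

Note that runtime.ReadMemStats briefly stops the world, so it is probably best called on a fairly long interval.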

Cannot capture string values

Cloudprober cannot capture string values output by a script that it runs. At the moment it can capture only numerical values; an enhancement is needed to capture string values.

Jitter to avoid thundering herd?

Is there any jitter configuration that we can use to avoid a large number of probes hitting the same target simultaneously? Context: we launch many cloudprober instances at the same time in various locations to probe the same set of targets.
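One common way to get this effect is to sleep for a random fraction of the probe interval before the first run. The sketch below illustrates the idea only; jitterer, newJitterer, startDelay and exampleInterval are hypothetical names, and this is not an existing cloudprober option.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// exampleInterval is an assumed probe interval used for illustration.
const exampleInterval = 5 * time.Second

// jitterer produces random start delays. Hypothetical type, not cloudprober's.
type jitterer struct {
	r *rand.Rand
}

func newJitterer(seed int64) *jitterer {
	return &jitterer{r: rand.New(rand.NewSource(seed))}
}

// startDelay returns a random delay in [0, interval). Sleeping this long
// before the first probe run spreads instances that start at the same moment
// across one probe interval.
func (j *jitterer) startDelay(interval time.Duration) time.Duration {
	if interval <= 0 {
		return 0
	}
	return time.Duration(j.r.Int63n(int64(interval)))
}

func main() {
	j := newJitterer(time.Now().UnixNano())
	d := j.startDelay(exampleInterval)
	fmt.Printf("would sleep %v before the first probe run\n", d)
}
```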

Introduce the concept of shared targets

We could do it in two ways:
a) Define targets independently of the probes and assign each one of them a unique id. Then in the probe definition refer to that id.

targets {
  id: "vm-private-ip"
  gce_targets {
  }
}

probe {
  name: "vm-to-vm"
  type: PING
  targets_id: "vm-private-ip"
  ping_probe {
  }
}

b) Introduce the concept of probeset where a probeset is a set of probes that use the same targets.

probeset {
  targets {
    ..
  }
  probe {}
  probe {}
}

Need a few explanations on cloudprober

Hi,

I see a metric named latency. I just wanted to know whether that metric is the latency of cloudprober itself or of the network.

Also, after adding/updating the URL in cloudprober.cfg, is there any way to reload it at runtime? Every time I make changes to the file, I restart the container.

Can someone advise?

Thanks

Is em.Clone redundant?

// Start starts the probe and writes back the data on the provided channel.
// Probe should have been initialized with Init() before calling Start on it.
func (p *Probe) Start(ctx context.Context, dataChan chan *metrics.EventMetrics) {
	if p.conn == nil {
		p.l.Critical("Probe has not been properly initialized yet.")
	}
	defer p.conn.close()
	for ts := range time.Tick(p.opts.Interval) {
		// Don't run another probe if context is canceled already.
		select {
		case <-ctx.Done():
			return
		default:
		}

		p.runProbe()
		p.l.Debugf("%s: Probe finished.", p.name)
		if (p.runCnt % uint64(p.c.GetStatsExportInterval())) != 0 {
			continue
		}
		for _, t := range p.targets {
			em := metrics.NewEventMetrics(ts).
				AddMetric("total", metrics.NewInt(p.sent[t])).
				AddMetric("success", metrics.NewInt(p.received[t])).
				AddMetric("latency", metrics.NewFloat(p.latency[t].Seconds()/p.opts.LatencyUnit.Seconds())).
				AddLabel("ptype", "ping").
				AddLabel("probe", p.name).
				AddLabel("dst", t)

			dataChan <- em.Clone()
			p.l.Info(em.String())
		}
	}
}

Is the Clone() call in dataChan <- em.Clone() redundant? Clone allocates new memory for a copy of the metrics, yet em is already newly allocated on every loop iteration.
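Whether a copy is needed generally depends on whether the sender modifies the value after sending it, because a pointer sent on a channel shares memory with the sender. The sketch below shows the difference with a hypothetical eventMetrics type; it is a stand-in for illustration and not cloudprober's metrics.EventMetrics.

```go
package main

import "fmt"

// eventMetrics is a hypothetical stand-in for a metrics container.
type eventMetrics struct {
	values map[string]int64
}

// clone returns a deep copy, so later changes to em are not visible through
// the copy.
func (em *eventMetrics) clone() *eventMetrics {
	c := &eventMetrics{values: make(map[string]int64, len(em.values))}
	for k, v := range em.values {
		c.values[k] = v
	}
	return c
}

// sendAndMutate sends em (or a clone of it) on a buffered channel, then
// modifies em, and returns what the receiver sees for "total".
func sendAndMutate(useClone bool) int64 {
	ch := make(chan *eventMetrics, 1)
	em := &eventMetrics{values: map[string]int64{"total": 1}}
	if useClone {
		ch <- em.clone()
	} else {
		ch <- em
	}
	em.values["total"] = 2 // the sender keeps using em after the send
	return (<-ch).values["total"]
}

func main() {
	fmt.Println(sendAndMutate(false)) // prints 2: receiver shares memory with sender
	fmt.Println(sendAndMutate(true))  // prints 1: receiver has its own copy
}
```

In the quoted Start function, em is only read (em.String()) after the send, so under that reading the clone mainly guards against future changes to the loop body.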

HTTP probe headers value

Is it possible to specify header values in the cfg file?

http_probe {
  protocol: HTTPS
  method: GET
  relative_url: "/"
  headers: {
    name: "Content-Type"
    value: "application/json"
  }
}

Should the prometheus surfacer omit timestamps from its output?

Cloudprober's prometheus surfacer writes data with timestamps attached, but in some cases Prometheus refuses to ingest it, with an error like:
msg="Error on ingesting samples that are too old or are too far into the future"

As Ben Kochie said in https://groups.google.com/d/msg/prometheus-users/NvLGxFZczJk/8hlYoAEsCwAJ:

It's recommended to omit the timestamp. The timestamp in the spec is there to allow for federation or importing of data from external TSDBs

Should we delete the timestamp?
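For reference, the timestamp is an optional trailing field (milliseconds since the epoch) in the Prometheus text exposition format, so dropping it only changes the end of each line. The sketch below illustrates this with a hypothetical promLine helper; it is not the surfacer's actual code.

```go
package main

import (
	"fmt"
	"strconv"
)

// promLine formats one sample in the Prometheus text exposition format.
// labels is a preformatted label string such as `probe="ping"`. The
// timestamp (milliseconds since the epoch) is appended only when includeTS
// is true. Hypothetical helper for illustration.
func promLine(name, labels string, value float64, tsMillis int64, includeTS bool) string {
	line := fmt.Sprintf("%s{%s} %s", name, labels, strconv.FormatFloat(value, 'f', -1, 64))
	if includeTS {
		line += " " + strconv.FormatInt(tsMillis, 10)
	}
	return line
}

func main() {
	labels := `ptype="ping",probe="ping"`
	fmt.Println(promLine("success", labels, 160, 1522883241000, true))
	fmt.Println(promLine("success", labels, 160, 1522883241000, false))
}
```

Without the timestamp, Prometheus assigns the scrape time to the sample.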

FR: Add (useful) "/config" endpoint to documentation

Hunting around the code for a solution to an issue, I discovered the "/config" endpoint.

This is a useful endpoint for confirming cloudprober's current configuration.

Please consider including it (I think it's not currently) in the documentation.

[Help] How to use Probe specific Configs?

In the docs only the code is linked, and since I'm not familiar with protobuf / Go it didn't really help me. My naive approach was something like this:

probe {
  name: "cloudprober"
  type: HTTP
  targets {
    host_names: "google.de"
  }
  interval_msec: 5000  # 5s
  timeout_msec: 1000   # 1s
  relative_url: "/"
}

But that already doesn't work. The config files also don't have a property for something like "AbstractSpecialProbeConfig", and/or there is no inheritance, so where should I put the property? (Maybe something for the docs?)

Also, adding more than one HTTP probe only worked via host_names: "host1.de, host2.de", which was also not very clear to me (I guess it only matches against the first probe of a type it finds?).
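Judging from other configs on this page, type-specific options appear to go inside a block named after the probe type (http_probe for HTTP, ping_probe for PING). Under that assumption, the config above would look something like this sketch:

```text
probe {
  name: "cloudprober"
  type: HTTP
  targets {
    host_names: "google.de"
  }
  interval_msec: 5000  # 5s
  timeout_msec: 1000   # 1s
  http_probe {
    relative_url: "/"
  }
}
```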

Extend HTTP probe type to support more complex probes

It would be useful if the HTTP probe were extended to support the following:

  • All HTTP method types (GET, POST, PUT, HEAD, DELETE, PATCH, OPTIONS).
  • Request bodies.
  • Arbitrary headers.

Being able to customize HTTP requests in these ways would allow for a more comprehensive probe into a system in line with normal user behavior.
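As an illustration of what such a probe would need to build for each run, the sketch below constructs a request with an arbitrary method, body and headers using Go's net/http package. buildRequest is a hypothetical helper and the URL is a placeholder; this is not the HTTP probe's actual implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// buildRequest creates an HTTP request with the given method, body and
// headers. It only constructs the request; sending it is left to the caller.
// Hypothetical helper for illustration.
func buildRequest(method, url, body string, headers map[string]string) (*http.Request, error) {
	req, err := http.NewRequest(method, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	for name, value := range headers {
		req.Header.Set(name, value)
	}
	return req, nil
}

func main() {
	req, err := buildRequest(http.MethodPost, "https://example.com/api", `{"ping":true}`,
		map[string]string{"Content-Type": "application/json"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("Content-Type"))
}
```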

Propose: Kubernetes Deployment

Assuming #88, it's possible to deploy cloudprober to Kubernetes and reference a custom config through Kubernetes' ConfigMap:

Assuming cloudprober.google.cfg, per the documentation example, this can be uploaded to Kubernetes as a configuration file accessible as cloudprober.cfg:

kubectl create configmap cloudprober-config \
--from-file=cloudprober.cfg=cloudprober.google.cfg \
--namespace=default

or:

apiVersion: v1
data:
  cloudprober.cfg: |
    probe {
      name: "google"
      type: HTTP
      targets {
        host_names: "www.google.com"
      }
      interval_msec: 15000  # 15s
      timeout_msec: 1000   # 1s
    }
kind: ConfigMap
metadata:
  name: cloudprober-config
  namespace: default

NB Rename "cloudprober.cfg" in both as appropriate.

and:

kubectl describe configmaps/cloudprober-config \
--namespace=default

yields:

Name:         cloudprober-config
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
cloudprober.cfg:
----
probe {
  name: "google"
  type: HTTP
  targets {
    host_names: "www.google.com"
  }
  interval_msec: 15000  # 15s
  timeout_msec: 1000   # 1s
}

Then:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: cloudprober
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cloudprober
    spec:
      volumes:
      - name: cloudprober-config
        configMap:
          name: cloudprober-config
      containers:
      - name: cloudprober
        image: cloudprober/cloudprober
        args: [
          "--config_file","/cfg/cloudprober.cfg",
          "--logtostderr"
        ]
        volumeMounts:
        - name: cloudprober-config
          mountPath: /cfg
        ports:
        - name: http
          containerPort: 9313
        env:
        - name: CLOUDPROBER_PORT
          value: "9313"
---
apiVersion: v1
kind: Service
metadata:
  name: cloudprober
  labels:
    app: cloudprober
spec:
  ports:
  - port: 9313
    protocol: TCP
    targetPort: 9313
  selector:
    app: cloudprober
  type: NodePort

For some reason, under Kubernetes, cloudprober balks if CLOUDPROBER_PORT is not set.

Postgres Surfacer

Currently, I am auditing using cloudprober with Prometheus and a Postgres remote endpoint. It would simplify things drastically to be able to write to Postgres directly from cloudprober. I would be happy to contribute this if it's something that could be in cloudprober.

Error downloading and running the latest docker image

Hi all, I am a newbie. Help is appreciated. Thanks.

$ docker run --net host cloudprober/cloudprober
I0419 23:19:54.254899 1 cloudprober.go:130] Error reading config from metadata. Err: metadata: GCE metadata "project/attributes/cloudprober_config" not defined
W0419 23:19:54.255019 1 cloudprober.go:139] Config file /etc/cloudprober.cfg not found. Using default config.
I0419 23:19:54.306354 1 sysvars_gce.go:63] No instance_template found. Defaulting to undefined.
F0419 23:19:54.316272 1 cloudprober.go:178] Error initializing cloudprober. Err: error while creating listener for default HTTP server. Err: listen tcp :9313: bind: address already in use

[feature] --reload

It would be nice to have a reload command that re-reads the configuration instead of restarting the main process.

Propose: Dockerfile w/ ENTRYPOINT & CMD

Please consider revising project's Dockerfile to use ENTRYPOINT (to start cloudprober) and CMD to default the flags to --logtostderr like so:

...
ENTRYPOINT ["/cloudprober"]
CMD ["--logtostderr"]

Currently, the container does not permit e.g. arbitrary config files:

docker run \
--publish=9313:9313 \
--volume=$PWD/cloudprober.google.cfg:/cloudprober.cfg cloudprober/cloudprober \
--config_file /cloudprober.cfg \
--logtostderr

produces:

docker: Error response from daemon: oci runtime error: container_linux.go:265: starting container process caused "exec: \"--config_file\": executable file not found in $PATH".
ERRO[0000] error waiting for container: context canceled 

With the change, the docker command works.

This change helps, for example, when running cloudprober on Kubernetes.

Thoughts: Synthetic Transaction Support?

My current org (and many previous places I've been a part of) has a need for a tool that can drive synthetic transactions by applying an application-level input on a schedule (much like registering a probe :p). This service could measure latency/success of message processing (with a timeout). In my head this service is different from a probe, as it would potentially need to support many different application-level protocols (i.e. send a message over Cloud Pub/Sub, Kafka, NSQ, SQS, gRPC, HTTP/JSON, etc.).

I was imagining that the input would contain a unique id for each synthetic transaction, and when the target service is finished processing it would make an RPC call with the transaction status back to this service.

The reason I'm wondering about this in cloudprober is because cloudprober has all the pieces for something like this:

  • config system
  • scheduling system
  • surfacer architecture
  • probe architecture

For something like the above, I imagine it would require some additional features, like:

  • stateful storage of active synthetic transactions
  • gRPC service for clients to modify synthetic transaction states (i.e. success/failure/completion)
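To make the "stateful storage" idea concrete, here is a minimal in-memory sketch: a transaction is registered when it is sent, and the callback from the target service completes it and yields the latency. Every name here (txTracker, Begin, Complete, exampleTimeout, exampleStart) is hypothetical, and nothing like this exists in cloudprober.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Assumed values used only for illustration.
const exampleTimeout = 5 * time.Second

var exampleStart = time.Unix(1522883241, 0)

// txTracker keeps in-memory state for in-flight synthetic transactions.
type txTracker struct {
	mu      sync.Mutex
	started map[string]time.Time
	timeout time.Duration
}

func newTxTracker(timeout time.Duration) *txTracker {
	return &txTracker{started: make(map[string]time.Time), timeout: timeout}
}

// Begin records that the transaction with the given id was sent at time now.
func (t *txTracker) Begin(id string, now time.Time) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.started[id] = now
}

// Complete is what the target service's callback would invoke. It returns the
// latency, plus an error if the id is unknown or the timeout was exceeded.
func (t *txTracker) Complete(id string, now time.Time) (time.Duration, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	start, ok := t.started[id]
	if !ok {
		return 0, errors.New("unknown transaction id: " + id)
	}
	delete(t.started, id)
	latency := now.Sub(start)
	if latency > t.timeout {
		return latency, errors.New("transaction timed out: " + id)
	}
	return latency, nil
}

func main() {
	tr := newTxTracker(exampleTimeout)
	tr.Begin("tx-1", exampleStart)
	latency, err := tr.Complete("tx-1", exampleStart.Add(120*time.Millisecond))
	fmt.Println(latency, err) // prints "120ms <nil>"
}
```

A real implementation would also need to expire transactions that never complete and to persist state across restarts.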

@manugarg I would love to hear your thoughts on synthetic transactions and if cloudprober might be a home for features like this?

How to setup a ping probe in cloudprober.cfg

Hi,
@manugarg
We are trying to replace the existing BB exporter with cloudprober now. We were able to test the DNS and HTTP probes.
But when we try to test the PING probe, it does not work as expected. Can you please tell me if something is wrong with my cloudprober.cfg?

probe {
  name: "test_ping_job"
  type: PING
  targets {
    gce_targets {
      instances {}
    }
  }
  run_on: "{{"123456.bnymellon.com"}}"
  interval_msec: 12000  # 12s
  timeout_msec: 6000    # 6s
}

[Bug] Cloudprober stops working

We are using Cloudprober to ping ~20 hosts currently. From time to time it stops working, without crashing the container. The HTTP Endpoint still works, but there are no new results generated.

Unable to successfully run a probe in server mode

It appears that the code is not successfully binding stdin (stdout, stderr) to Cloudprober.

I added some debugging and in external.go sendRequest p.cmdStdin is <nil>

&{redis_probe server ./redis_probe [] 0xc4200b8a50 0xc42016cf60 0xc4200acdd0 0 map[] 1 false <nil> <nil> <nil> 0xc4200aa4e0 map[] map[:1] map[] 0xc4202bea00 map[] {0 0}}

The only place I have found where p.cmdStdin is assigned is startCmdIfNotRunning, but I added a debugging statement to this function and it appears not to be called. Additionally, Cloudprober reports Creating a EXTERNAL probe: redis_probe, whereas startCmdIfNotRunning would say Starting external command. The former appears to be generated by:

l.Infof("Creating a %s probe: %s", p.GetType(), p.GetName())

I tried using the redis example:

go run main.go
2018/04/04 16:04:00 hello=world
set_latency_ms 0.629330
get_latency_ms 0.474291

And after building it:

./redis-probe
2018/04/04 16:04:18 hello=world
set_latency_ms 0.889990
get_latency_ms 0.668944

And, if I set the cloudprober.cfg to mode: ONCE it works:

cloudprober --config_file=./cloudprober.cfg --logtostderr --debug_log
Error while getting default lameduck lister, lameduck behavior will be disabled. Err: global lameduck service not initialized
Creating a EXTERNAL probe: redis-probe
Initialized prometheus exporter at the URL: /metrics
Running external command: ./redis-probe 
1522883241 labels=ptype=external,probe=redis-probe,dst= success=1 total=1 latency=4325.327
1522883241 labels=ptype=external,probe=redis-probe,dst= set_latency_ms=1.032 get_latency_ms=0.83
cloudprober 1522883239827938228 1522883241 labels=ptype=external,probe=redis-probe,dst= success=1 total=1 latency=4325.327
cloudprober 1522883239827938229 1522883241 labels=ptype=external,probe=redis-probe,dst= set_latency_ms=1.032 get_latency_ms=0.83
Checking validity of new label: ptype
Checking validity of new label: probe
Checking validity of new label: dst
Checking validity of new metric: success
Checking validity of new metric: total
Checking validity of new metric: latency
Checking validity of new metric: set_latency_ms
Checking validity of new metric: get_latency_ms

BUT:

cloudprober --config_file=./cloudprober.cfg --logtostderr --debug_log
Error while getting default lameduck lister, lameduck behavior will be disabled. Err: global lameduck service not initialized
Creating a EXTERNAL probe: redis-probe
Initialized prometheus exporter at the URL: /metrics
Sending a probe request 1 to the external probe server for target 
2018/04/04 16:09:19 [dazwilkin:sendRequest] &{redis-probe server ./redis-probe [] 0xc420188780 0xc420195440 0xc4200ac450 0 map[] 1 false <nil> <nil> <nil> 0xc4200aa5a0 map[] map[:1] map[] 0xc42018e980 map[] {0 0}}
2018/04/04 16:09:19 [dazwilkin:sendRequest] <nil>
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4c5077]

goroutine 47 [running]:
fmt.Fprintf(0x0, 0x0, 0xd21aa0, 0x17, 0xc42007bcd8, 0x2, 0x2, 0x82ce045d, 0xbea974a7e370a659, 0x1d)
	/usr/lib/google-golang/src/fmt/print.go:189 +0x77
github.com/google/cloudprober/probes/external/serverutils.WriteMessage(0xd99d20, 0xc420184940, 0x0, 0x0, 0x1, 0x2)
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/probes/external/serverutils/serverutils.go:96 +0x1c9
github.com/google/cloudprober/probes/external.(*Probe).sendRequest(0xc42030c000, 0xc400000001, 0xd13950, 0x0, 0xc4200b14d8, 0xc4200af770)
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/probes/external/external.go:328 +0x641
github.com/google/cloudprober/probes/external.(*Probe).runServerProbe(0xc42030c000, 0xd9bce0, 0xc420195ce0, 0xc420195980)
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/probes/external/external.go:403 +0x199
github.com/google/cloudprober/probes/external.(*Probe).runProbe(0xc42030c000, 0xd9bca0, 0xc4200ce000, 0xc420195980)
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/probes/external/external.go:462 +0x103
github.com/google/cloudprober/probes/external.(*Probe).Start(0xc42030c000, 0xd9bca0, 0xc4200ce000, 0xc420195980)
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/probes/external/external.go:478 +0x80
created by github.com/google/cloudprober.(*Prober).start
	/usr/local/google/home/dazwilkin/Projects/dazwilkin-180330-cloudprober/go/src/github.com/google/cloudprober/cloudprober.go:248 +0x2b6

I had originally written my own trivial probe and tried it in server mode. When this didn't work (with the same results as above), I started working from the redis sample.

Background
I'm endeavoring to deploy Cloudprober (successfully) and probes (unsuccessfully) to Kubernetes. The approach requires both Cloudprober and the probes to be containerized. In this model, it's preferable (necessary?) to have the probe run in server mode to avoid the waste of spinning up a probe container, making a single measurement, terminating the container and then recreating another clone. IIUC this is the reasoning for the server mode probe too.

[feature] Prepend a prefix to all metrics

A meaningful prefix, e.g. cloudprober_, would help to distinguish the cloudprober metrics from other metrics. It should be prepended to all stats, automatically.

Add surfacer for InfluxDB

It should be easy to add a surfacer for InfluxDB. InfluxDB measurements correspond to EventMetrics (EM) in cloudprober. "tag keys" and "tag values" correspond to EM labels, and "field keys" and "field values" correspond to EM metrics.
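A rough sketch of that mapping, rendering labels as tags and metrics as fields in InfluxDB line protocol (measurement,tags fields timestamp). The toLineProtocol function and the "cloudprober" measurement name are assumptions for illustration; escaping of spaces and commas is omitted, and the timestamp is assumed to be in nanoseconds.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toLineProtocol renders one data point in InfluxDB line protocol:
//   measurement,tag=value,... field=value,... timestamp
// Keys are sorted for stable output. Escaping is omitted in this sketch.
func toLineProtocol(measurement string, labels map[string]string, metrics map[string]float64, tsNanos int64) string {
	var b strings.Builder
	b.WriteString(measurement)

	// Labels become tags.
	tagKeys := make([]string, 0, len(labels))
	for k := range labels {
		tagKeys = append(tagKeys, k)
	}
	sort.Strings(tagKeys)
	for _, k := range tagKeys {
		fmt.Fprintf(&b, ",%s=%s", k, labels[k])
	}

	// Metrics become fields.
	fieldKeys := make([]string, 0, len(metrics))
	for k := range metrics {
		fieldKeys = append(fieldKeys, k)
	}
	sort.Strings(fieldKeys)
	fields := make([]string, 0, len(fieldKeys))
	for _, k := range fieldKeys {
		fields = append(fields, fmt.Sprintf("%s=%g", k, metrics[k]))
	}

	fmt.Fprintf(&b, " %s %d", strings.Join(fields, ","), tsNanos)
	return b.String()
}

func main() {
	line := toLineProtocol("cloudprober",
		map[string]string{"ptype": "external", "probe": "redis-probe"},
		map[string]float64{"set_latency_ms": 1.032, "get_latency_ms": 0.83},
		1522883241000000000)
	fmt.Println(line)
}
```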

Float metrics values overflow on ARM systems

When compiled for ARM, metrics values are overflowing when they are cast from float64 to int. This seems to be caused by ARM using int32 instead of int64 during the conversion. This results in high-value metrics getting set to either 2147483.647 or 0 (depending on which GOARM value is used) as soon as the value overflows int32.

The code causing this:

// Truncate the float to 3-digit precision.
// Example: 3.33333333333 -> 3.333
ff := float64(int(f.f*1000)) / 1000
return strconv.FormatFloat(ff, 'f', -1, 64)

This can be fixed in a few ways:

  1. Don't truncate at all and just output the float as-is.
  2. Specify int64 instead of int. However, this still has a chance of overflow for large values.
  3. Instead of using the int cast and strconv.FormatFloat, use fmt.Sprintf("%.3f", f.f). This limits the output to three decimal places (by rounding rather than truncating) but does not remove the trailing zeros in the decimal in the same way FormatFloat currently does.
  4. Use fmt.Sprintf, but also perform strings.TrimRight to remove trailing zeros and the decimal point where valid. This is functionally similar to the existing version.

I implemented the fourth option on an incoming pull request. Let me know if you'd prefer a different version of the fix.
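A sketch of what the fourth option might look like; formatFloat is a hypothetical name, and the code in the actual pull request may differ. Note that fmt.Sprintf("%.3f", ...) rounds to the nearest value instead of truncating, so results can differ from the original code in the last digit.

```go
package main

import (
	"fmt"
	"strings"
)

// formatFloat formats f with at most three decimal places and no trailing
// zeros, without converting through an integer type (so it cannot overflow
// on platforms where int is 32 bits wide).
func formatFloat(f float64) string {
	s := fmt.Sprintf("%.3f", f)      // e.g. "3.333", "2.500", "3.000"
	s = strings.TrimRight(s, "0")    // "2.500" -> "2.5", "3.000" -> "3."
	return strings.TrimSuffix(s, ".") // "3." -> "3"
}

func main() {
	fmt.Println(formatFloat(3.33333333333)) // prints 3.333
	fmt.Println(formatFloat(2.5))           // prints 2.5
	fmt.Println(formatFloat(3))             // prints 3
	fmt.Println(formatFloat(1e13))          // prints 10000000000000
}
```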

A panic occurred when using HTTP probes

I have used multiple HTTP probes with one cloudprober instance.
However, after some time (about 30 minutes to 1 hour), a panic occurs.

Environment:
OS: CentOS 7.6.1810
Docker: 18.09.3
Cloudprober: v0.10.1

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4ee41c]

goroutine 45 [running]:
bufio.(*Writer).Flush(0xc00012f680, 0xc000434100, 0x68)
        /home/travis/.gimme/versions/go1.11.linux.amd64/src/bufio/bufio.go:575 +0x5c
bufio.(*Writer).Write(0xc00012f680, 0xc000434100, 0x6a, 0x100, 0x1, 0x9, 0x3)
        /home/travis/.gimme/versions/go1.11.linux.amd64/src/bufio/bufio.go:611 +0xeb
net/http.(*response).write(0xc0004c2000, 0x6a, 0xc000434100, 0x6a, 0x100, 0x0, 0x0, 0x15eebc0, 0x0, 0x0)
        /home/travis/.gimme/versions/go1.11.linux.amd64/src/net/http/server.go:1551 +0x149
net/http.(*response).Write(0xc0004c2000, 0xc000434100, 0x6a, 0x100, 0x3, 0x3, 0x409813)
        /home/travis/.gimme/versions/go1.11.linux.amd64/src/net/http/server.go:1521 +0x56
fmt.Fprintf(0xf65da0, 0xc0004c2000, 0xea1527, 0x9, 0xc00069be00, 0x3, 0x3, 0x6a, 0x0, 0x0)
        /home/travis/.gimme/versions/go1.11.linux.amd64/src/fmt/print.go:189 +0xa5
github.com/google/cloudprober/surfacers/prometheus.New.func1(0xf65da0, 0xc0004c2000, 0xc000246300, 0xc0002e8a20, 0x56)
        /home/travis/gopath/src/github.com/google/cloudprober/surfacers/prometheus/prometheus.go:131 +0x1bf
github.com/google/cloudprober/surfacers/prometheus.(*PromSurfacer).writeData(0xc0003ea120, 0xf65da0, 0xc0004c2000)
        /home/travis/gopath/src/github.com/google/cloudprober/surfacers/prometheus/prometheus.go:360 +0x24d
github.com/google/cloudprober/surfacers/prometheus.New.func3(0xc0003ea120, 0xc0003ea180)
        /home/travis/gopath/src/github.com/google/cloudprober/surfacers/prometheus/prometheus.go:150 +0x12c
created by github.com/google/cloudprober/surfacers/prometheus.New
        /home/travis/gopath/src/github.com/google/cloudprober/surfacers/prometheus/prometheus.go:144 +0x1d7

How do you use Cloudprober?

If you use Cloudprober, would you mind sharing how you use it[1]? This will help in two ways:

  • We'll be able to plan its roadmap better. We'll know what are the features people care about and what we should work on.
  • It will give us a warm fuzzy feeling and motivate us to do more 😄

[1] - For example, do you use targets discovery? Which monitoring system do you integrate with -- prometheus, stackdriver, or something else entirely? Which probe types? Do you run it in a docker container or just a vanilla binary etc.

Community Grafana Prometheus Dashboard

While creating Grafana dashboards for cloudprober metrics is not very hard, I would personally really enjoy seeing how other people view cloudprober's output. I was hoping that someone already has a Grafana Prometheus dashboard that they would be OK with publishing?
Thank you

Non-Datagram Packet Ping fails with multiple targets

On the current master branch (667a6db), when use_datagram_socket: false is set with multiple targets, the ping begins failing with the error packet too small: size (4) < minPacketSize (16). If two targets are specified, it seems that half of the packets start failing. If three or more targets are listed, no packets from the third and later targets succeed.

This can be seen using this config:

probe {
  name: "ping"
  type: PING
  targets {
    host_names: "google.com, github.com"
  }
  ping_probe {
    use_datagram_socket: false
  }
}

The errors will be:

W0320 11:25:40.905329    5273 ping.go:339] packet too small: size (4) < minPacketSize (16)
W0320 11:25:41.875763    5273 ping.go:331] read ip4 0.0.0.0: i/o timeout
W0320 11:25:42.906471    5273 ping.go:339] packet too small: size (4) < minPacketSize (16)
W0320 11:25:43.874060    5273 ping.go:331] read ip4 0.0.0.0: i/o timeout
W0320 11:25:44.904014    5273 ping.go:339] packet too small: size (4) < minPacketSize (16)
W0320 11:25:45.874674    5273 ping.go:331] read ip4 0.0.0.0: i/o timeout
W0320 11:25:46.903673    5273 ping.go:339] packet too small: size (4) < minPacketSize (16)
W0320 11:25:47.874429    5273 ping.go:331] read ip4 0.0.0.0: i/o timeout
W0320 11:25:48.903609    5273 ping.go:339] packet too small: size (4) < minPacketSize (16)
W0320 11:25:49.873686    5273 ping.go:331] read ip4 0.0.0.0: i/o timeout

And the metrics will show:

#TYPE total counter
total{ptype="ping",probe="ping",dst="google.com"} 160
total{ptype="ping",probe="ping",dst="github.com"} 160
#TYPE success counter
success{ptype="ping",probe="ping",dst="google.com"} 160
success{ptype="ping",probe="ping",dst="github.com"} 80

Partial configuration reload

Hi,
We are considering using Cloudprober in our organization as part of an ErrorBudget tool. We would like to use it to gather metrics from Elastic (numbers of failed/succeeded requests, etc.). The second metric source would be actively probing services' healthchecks. Implementing a probe for Elastic shouldn't be a problem, and we can use the HTTP probe for healthchecks. But the metrics can be changed by users as they add new SLOs or change existing ones. Restarting the whole of Cloudprober every time the metrics change will not work, as we would lose too much data.
Do you think we can use Cloudprober for our use case? Is there any way to change the config for only a subset of probes? We considered extending Cloudprober, but it seems that we would have to rewrite a lot of code, as all probes use the same context.

Would #149 solve our problem? Are there plans to add it in the foreseeable future?
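One possible direction, sketched under the assumption that each probe's run loop watches a context: derive a separate cancellable context per probe from a shared root, so a single probe can be stopped or replaced without affecting the others. probeManager and its methods are hypothetical names and do not reflect how cloudprober is structured.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// probeManager gives every probe its own context derived from a shared root,
// so one probe can be stopped or restarted without touching the others.
type probeManager struct {
	mu         sync.Mutex
	root       context.Context
	cancelRoot context.CancelFunc
	cancels    map[string]context.CancelFunc
}

func newProbeManager() *probeManager {
	root, cancel := context.WithCancel(context.Background())
	return &probeManager{root: root, cancelRoot: cancel, cancels: make(map[string]context.CancelFunc)}
}

// Start registers a probe, stopping any previous instance with the same name,
// and returns the context that the probe's run loop should watch.
func (m *probeManager) Start(name string) context.Context {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cancel, ok := m.cancels[name]; ok {
		cancel() // stop the previous instance of this probe
	}
	ctx, cancel := context.WithCancel(m.root)
	m.cancels[name] = cancel
	return ctx
}

// Stop cancels only the named probe.
func (m *probeManager) Stop(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cancel, ok := m.cancels[name]; ok {
		cancel()
		delete(m.cancels, name)
	}
}

// StopAll cancels the root context and therefore every probe.
func (m *probeManager) StopAll() {
	m.cancelRoot()
}

func main() {
	m := newProbeManager()
	elastic := m.Start("elastic")
	health := m.Start("healthcheck")
	m.Stop("elastic")
	fmt.Println(elastic.Err() != nil, health.Err() != nil) // prints "true false"
}
```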
