Comments (11)
I have looked at this issue, but in my case even the metric name stays set to the app.id value, so each time I run the job, new metrics are created...
from spark-metrics.
Instead of --repositories https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases
use Maven Central (https://search.maven.org/artifact/com.banzaicloud/spark-metrics_2.11) to ensure that you use the latest version of spark-metrics that matches your Spark version (e.g., 2.3-3.0.1).
I just tried spark-submit --master yarn --queue default --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf --deploy-mode cluster --packages org.apache.hadoop:hadoop-aws:2.7.7,org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4,org.apache.hadoop:hadoop-aws:2.7.7,org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4,com.banzaicloud:spark-metrics_2.11:2.3-3.0.1,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2 /tmp/test.py
It fails and no metrics are posted at all.
I can't find any clue in the stack traces.
I also tried with the Prometheus modules version 0.8.1.
I'm checking the libs; I didn't write this test.py script.
Logs from the node (same exception as with the Prometheus modules 0.3.0, FYI):
Traceback (most recent call last):
File "test.py", line 12, in <module>
.config(conf=conf)
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/sql/session.py", line 173, in getOrCreate
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 353, in getOrCreate
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 119, in __init__
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 181, in _do_init
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 292, in _initialize_context
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodException: com.banzaicloud.spark.metrics.sink.PrometheusSink.<init>(java.util.Properties, com.codahale.metrics.MetricRegistry, org.apache.spark.SecurityManager)
The java.lang.NoSuchMethodException: com.banzaicloud.spark.metrics.sink.PrometheusSink
error message indicates that the spark-metrics jar is not available on the node. Please ensure that the jar is downloaded from the Maven repository and is available on the node.
Also, *.sink.prometheus.class in your spark-metrics config seems to be wrong; it should be *.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
(see https://github.com/banzaicloud/spark-metrics/blob/2.3-3.0.1/PrometheusSink.md#how-to-enable-prometheussink-in-spark).
Thanks for noticing, I don't know why we ended up with something different than in the doc...
However, with the correct class path, without --repositories, and the latest package version com.banzaicloud:spark-metrics_2.11:2.3-2.1.0, it still fails, but I get different errors:
1. JMX collector enabled
*.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
# Prometheus pushgateway address
*.sink.prometheus.pushgateway-address-protocol=https
*.sink.prometheus.pushgateway-address=ourpushgateway
#*.sink.prometheus.period=<period> - defaults to 10
#*.sink.prometheus.unit=<unit> - defaults to seconds (TimeUnit.SECONDS)
#*.sink.prometheus.pushgateway-enable-timestamp=<enable/disable metrics timestamp> - defaults to false
# Metrics name processing (version 2.3-1.1.0 +)
#*.sink.prometheus.metrics-name-capture-regex="application_.*"
#*.sink.prometheus.metrics-name-replacement=${spark.app.name}
#*.sink.prometheus.labels=<labels in label=value format separated by comma>
# Support for JMX Collector (version 2.3-2.0.0 +)
*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=true
*.sink.prometheus.jmx-collector-config=/mnt/code/infra-hdp-test/jmxCollector.yaml
# Enable HostName in Instance instead of Appid (Default value is false i.e. instance=${appid})
*.sink.prometheus.enable-hostname-in-instance=true
# Enable JVM metrics source for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
File "/mnt/disk3/yarn/local/usercache/g.noale/appcache/application_1585733805129_0560/container_e77_1585733805129_0560_02_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1067, in start
self.socket.connect((self.address, self.port))
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Also, some metrics are posted
2. JMX collector disabled
# Support for JMX Collector (version 2.3-2.0.0 +)
#*.sink.prometheus.enable-dropwizard-collector=true
#*.sink.prometheus.enable-jmx-collector=true
#*.sink.prometheus.jmx-collector-config=/mnt/code/infra-hdp-test/jmxCollector.yaml
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.ClassNotFoundException: org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
No metrics posted
You cannot have both Dropwizard and JMX metrics collectors enabled at the same time:
*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=true
Enable only one of them. Spark provides more metrics through Dropwizard than through JMX, though consuming JMX metrics might be easier.
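For instance (a sketch based on the config posted earlier in this thread; the collector-config path is the reporter's), keeping only the Dropwizard collector would look like:

```properties
# Keep the Dropwizard collector; disable the JMX collector
*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=false
#*.sink.prometheus.jmx-collector-config=/mnt/code/infra-hdp-test/jmxCollector.yaml
```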
Ok, now we do not get any exception by setting driver.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink.
I think we have an issue with Maven, because it only works with --repositories.
But I cannot get the metrics namespace set to the right value, i.e. the job value in Prometheus.
20/04/02 12:27:36 INFO PrometheusSink: metricsNamespace=None, sparkAppName=Some(test.py), sparkAppId=Some(application_1585816535116_0110), executorId=Some(driver)
20/04/02 12:27:36 INFO PrometheusSink: role=driver, job=application_1585816535116_0110
I tried METRICS_NAMESPACE= and spark.metrics.namespace= in the config file.
Hi! We finally worked it out by manually putting all the jars in a shared folder:
spark-submit --master yarn --queue default --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf --deploy-mode client --jars /mnt/code/infra-hdp-test/spark-metrics_2.11-2.3-3.0.1.jar,/mnt/code/infra-hdp-test/simpleclient-0.3.0.jar,/mnt/code/infra-hdp-test/simpleclient_pushgateway-0.3.0.jar,/mnt/code/infra-hdp-test/metrics-core-3.1.2.jar,/mnt/code/infra-hdp-test/simpleclient_dropwizard-0.3.0.jar,/mnt/code/infra-hdp-test/simpleclient_common-0.3.0.jar --conf spark.executor.extraClassPath=/mnt/code/infra-hdp-test/spark-metrics_2.11-2.3-3.0.1.jar:/mnt/code/infra-hdp-test/simpleclient-0.3.0.jar:/mnt/code/infra-hdp-test/simpleclient_pushgateway-0.3.0.jar:/mnt/code/infra-hdp-test/metrics-core-3.1.2.jar:/mnt/code/infra-hdp-test/simpleclient_dropwizard-0.3.0.jar:/mnt/code/infra-hdp-test/simpleclient_common-0.3.0.jar /tmp/test.py
And the only way we found to set the metrics namespace is directly in the app code:
from pyspark import SparkConf

conf = (
    SparkConf()
    .set('spark.serializer', 'org.apache.spark.serializer.KryoSerializer')
    .set('spark.metrics.namespace', 'test_namespace')
)
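For what it's worth, spark.metrics.namespace is a standard Spark configuration property (its default is the app ID, which is exactly why new metric names appear on every run), so the same value should also be acceptable as a submit-time flag; whether it propagates correctly in this YARN setup is untested here:

```shell
spark-submit \
  --conf spark.metrics.namespace=test_namespace \
  ...
```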
And now we need the app.id back in a metric label, and we are back to the point where the interpolation doesn't work in metrics.conf:
I set *.sink.prometheus.labels=appid=${spark.app.id}
and I get a label with appid="${spark.app.id}",
which is not really useful :-)
Any advice?
Substitution for *.sink.prometheus.labels is not supported. The labels provided through *.sink.prometheus.labels are passed to Prometheus in their original format. *.sink.prometheus.labels is meant for a static list of labels that you want on all metrics, in addition to the ones published by Spark. The value of spark.app.id is published under the instance label.
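To illustrate why the placeholder comes through verbatim (this is a minimal stand-in parser for illustration, not Spark's actual MetricsConfig loader): metrics.conf values are plain properties-style strings, so the sink receives ${spark.app.id} as a literal:

```python
# Minimal stand-in for a Java-properties-style parser (illustrative only).
def parse_metrics_conf(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")  # split on the first '=' only
        props[key.strip()] = value.strip()
    return props

conf = parse_metrics_conf("*.sink.prometheus.labels=appid=${spark.app.id}")
# No ${...} substitution happens; the value stays a literal string.
print(conf["*.sink.prometheus.labels"])  # -> appid=${spark.app.id}
```

A static list such as *.sink.prometheus.labels=env=prod,team=data is the intended use.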
/cc @sancyx @baluchicken