
Comments (11)

Gnoale commented on June 1, 2024

I have looked at this issue,
but in my case even the metric name stays set to the app.id value, so each time I run the job new metrics are created...
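
For reference, the metrics name processing options quoted later in this thread (metrics-name-capture-regex / metrics-name-replacement) are presumably the knob intended for collapsing those per-run names; a minimal metrics.conf sketch mirroring the commented example below, not verified here:

# collapse per-run metric names (application_<id>...) into a stable name
*.sink.prometheus.metrics-name-capture-regex="application_.*"
*.sink.prometheus.metrics-name-replacement=${spark.app.name}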


stoader commented on June 1, 2024

Instead of --repositories https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases, use the Maven repository (https://search.maven.org/artifact/com.banzaicloud/spark-metrics_2.11) to ensure that you use the latest version of spark-metrics that matches your Spark version (e.g. 2.3-3.0.1).
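
For example, letting --packages resolve everything from Maven Central (a sketch; the package list, master, and paths simply mirror the command in the next comment):

spark-submit --master yarn --deploy-mode cluster \
  --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf \
  --packages com.banzaicloud:spark-metrics_2.11:2.3-3.0.1,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2 \
  /tmp/test.py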


Gnoale commented on June 1, 2024

I just tried spark-submit --master yarn --queue default --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf --deploy-mode cluster --packages org.apache.hadoop:hadoop-aws:2.7.7,org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4,org.apache.hadoop:hadoop-aws:2.7.7,org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4,com.banzaicloud:spark-metrics_2.11:2.3-3.0.1,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2 /tmp/test.py
It fails and no metrics are posted at all.
The stack traces give me no clue.


Gnoale commented on June 1, 2024

I also tried with the Prometheus 0.8.1 module versions.
I'm checking the libs; I didn't write this test.py script myself.
Logs from the node (same exception as with the Prometheus 0.3.0 modules, FYI):

Traceback (most recent call last):
  File "test.py", line 12, in <module>
    .config(conf=conf)
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/sql/session.py", line 173, in getOrCreate
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 353, in getOrCreate
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 119, in __init__
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 181, in _do_init
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/pyspark.zip/pyspark/context.py", line 292, in _initialize_context
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
  File "/mnt/disk5/yarn/local/usercache/g.noale/appcache/application_1585733805129_0091/container_e77_1585733805129_0091_02_000001/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodException: com.banzaicloud.spark.metrics.sink.PrometheusSink.<init>(java.util.Properties, com.codahale.metrics.MetricRegistry, org.apache.spark.SecurityManager)


stoader commented on June 1, 2024

The java.lang.NoSuchMethodException: com.banzaicloud.spark.metrics.sink.PrometheusSink error message indicates that the spark-metrics jar is not available on the node. Please ensure that the jar is downloaded from the Maven repository and is available on the node.
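
One way to take dependency resolution out of the picture is to download the jar once to a path reachable from every node and pass it explicitly (a sketch; the /shared/jars path is a placeholder, and the Prometheus simpleclient and Dropwizard jars need the same treatment, as in the working command further down):

spark-submit \
  --jars /shared/jars/spark-metrics_2.11-2.3-3.0.1.jar \
  --conf spark.executor.extraClassPath=/shared/jars/spark-metrics_2.11-2.3-3.0.1.jar \
  --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf \
  /tmp/test.py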


stoader commented on June 1, 2024

Also, *.sink.prometheus.class in your spark-metrics config seems to be wrong; it should be *.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink (see https://github.com/banzaicloud/spark-metrics/blob/2.3-3.0.1/PrometheusSink.md#how-to-enable-prometheussink-in-spark).


Gnoale commented on June 1, 2024

Thanks for noticing, I don't know why we ended up with something different than in the doc...
However, with the correct class path, without --repositories, and the latest package version com.banzaicloud:spark-metrics_2.11:2.3-2.1.0, it still fails, but I get different errors:

1. JMX collector enabled

*.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
# Prometheus pushgateway address
*.sink.prometheus.pushgateway-address-protocol=https
*.sink.prometheus.pushgateway-address=ourpushgateway
#*.sink.prometheus.period=<period> - defaults to 10
#*.sink.prometheus.unit=< unit> - defaults to seconds (TimeUnit.SECONDS)
#*.sink.prometheus.pushgateway-enable-timestamp=<enable/disable metrics timestamp> - defaults to false

# Metrics name processing (version 2.3-1.1.0 +)
#*.sink.prometheus.metrics-name-capture-regex="application_.*"
#*.sink.prometheus.metrics-name-replacement=${spark.app.name}
#*.sink.prometheus.labels=<labels in label=value format separated by comma>

# Support for JMX Collector (version 2.3-2.0.0 +)
*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=true
*.sink.prometheus.jmx-collector-config=/mnt/code/infra-hdp-test/jmxCollector.yaml

# Enable HostName in Instance instead of Appid (Default value is false i.e. instance=${appid})
*.sink.prometheus.enable-hostname-in-instance=true

# Enable JVM metrics source for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource

  File "/mnt/disk3/yarn/local/usercache/g.noale/appcache/application_1585733805129_0560/container_e77_1585733805129_0560_02_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1067, in start
    self.socket.connect((self.address, self.port))
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused

Also, some metrics are posted

2. JMX collector disabled

# Support for JMX Collector (version 2.3-2.0.0 +)
#*.sink.prometheus.enable-dropwizard-collector=true
#*.sink.prometheus.enable-jmx-collector=true
#*.sink.prometheus.jmx-collector-config=/mnt/code/infra-hdp-test/jmxCollector.yaml

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.ClassNotFoundException: org.apache.spark.banzaicloud.metrics.sink.PrometheusSink

No metrics posted


stoader commented on June 1, 2024

You cannot have both the Dropwizard and JMX collectors enabled at the same time:

*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=true

Enable only one of them. Spark provides more metrics through Dropwizard than through JMX, though consuming JMX metrics might be easier.
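
For example, to keep the richer Dropwizard metrics, the relevant metrics.conf lines would be (a sketch):

*.sink.prometheus.enable-dropwizard-collector=true
*.sink.prometheus.enable-jmx-collector=false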


Gnoale commented on June 1, 2024

OK, now we no longer get any exception after setting driver.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
I think we have an issue with Maven, because it only works with --repositories

But I cannot get the metrics namespace to the right value, i.e. the job value in Prometheus.

20/04/02 12:27:36 INFO PrometheusSink: metricsNamespace=None, sparkAppName=Some(test.py), sparkAppId=Some(application_1585816535116_0110), executorId=Some(driver)
20/04/02 12:27:36 INFO PrometheusSink: role=driver, job=application_1585816535116_0110

I tried METRICS_NAMESPACE= and spark.metrics.namespace= in the config file.
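
As far as I can tell, spark.metrics.namespace is a Spark configuration property rather than a metrics.conf entry, so a sketch of setting it at submit time (which should be equivalent to the SparkConf approach used further down) would be:

spark-submit --master yarn --deploy-mode cluster \
  --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf \
  --conf spark.metrics.namespace=test_namespace \
  /tmp/test.py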


Gnoale commented on June 1, 2024

Hi! We finally worked it out by manually putting all the jars in a shared folder:

 spark-submit --master yarn --queue default --conf spark.metrics.conf=/mnt/code/infra-hdp-test/metrics.conf --deploy-mode client --jars /mnt/code/infra-hdp-test/spark-metrics_2.11-2.3-3.0.1.jar,/mnt/code/infra-hdp-test/simpleclient-0.3.0.jar,/mnt/code/infra-hdp-test/simpleclient_pushgateway-0.3.0.jar,/mnt/code/infra-hdp-test/metrics-core-3.1.2.jar,/mnt/code/infra-hdp-test/simpleclient_dropwizard-0.3.0.jar,/mnt/code/infra-hdp-test/simpleclient_common-0.3.0.jar  --conf spark.executor.extraClassPath=/mnt/code/infra-hdp-test/spark-metrics_2.11-2.3-3.0.1.jar:/mnt/code/infra-hdp-test/simpleclient-0.3.0.jar:/mnt/code/infra-hdp-test/simpleclient_pushgateway-0.3.0.jar:/mnt/code/infra-hdp-test/metrics-core-3.1.2.jar:/mnt/code/infra-hdp-test/simpleclient_dropwizard-0.3.0.jar:/mnt/code/infra-hdp-test/simpleclient_common-0.3.0.jar /tmp/test.py

And the only way we found to replace the metrics namespace is to set it directly in the app code:

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (
    SparkConf()
    .set('spark.serializer', 'org.apache.spark.serializer.KryoSerializer')
    # overrides the default metrics namespace (the app id)
    .set('spark.metrics.namespace', 'test_namespace')
)
spark = SparkSession.builder.config(conf=conf).getOrCreate()

And now we need the app.id back as a metric label,
and we are back to the point where interpolation doesn't work in metrics.conf:
I set *.sink.prometheus.labels=appid=${spark.app.id}
and I get a label appid="${spark.app.id}", which is not really useful :-)

Any advice?


stoader commented on June 1, 2024

Substitution for *.sink.prometheus.labels is not supported. The labels provided through *.sink.prometheus.labels are passed to Prometheus in their original format. *.sink.prometheus.labels is meant to be used for a static list of labels that you want to have on all metrics, in addition to the ones published by Spark. The value of spark.app.id is published under the instance label.
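
So a static label list would look like this (the label names and values here are purely illustrative):

*.sink.prometheus.labels=team=data,environment=staging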

/cc @sancyx @baluchicken

