Comments (19)

stoader commented on June 11, 2024

Can you share logs and your metrics.properties?

from spark-metrics.

paskalka commented on June 11, 2024

stoader commented on June 11, 2024

Based on others experiencing this issue on Yarn (#30 (comment)), it looks like Yarn doesn't distribute these jars to the executor hosts in time, so when the executors initialize the metrics system the jars are not there yet. (Note: we are running Spark on Kubernetes, with Kubernetes as the cluster manager instead of Yarn, and are not seeing this issue there.)

Can you check the timestamp of when these jars are distributed by Yarn to the executor hosts?

paskalka commented on June 11, 2024

stoader commented on June 11, 2024

@paskalka check the Yarn logs to see when and where the jars are copied/distributed, or ssh to the executor hosts and check the stats of the files.
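
One way to do the file check suggested above is sketched below. It assumes a Linux host with GNU `stat`; the function name `jar_mtimes` is made up for illustration, and the directory you pass in (for example a Yarn local directory on the executor host) depends on your cluster configuration. The timestamps can then be compared with the executor start time reported by `yarn logs -applicationId <application id>`.

```shell
# List every jar below a directory together with its last-modification time.
# Assumes GNU stat (-c); the directory is supplied by the caller.
jar_mtimes() {
  find "$1" -type f -name '*.jar' -exec stat -c '%y  %n' {} +
}

# Example (hypothetical path, adjust to your Yarn local dir):
#   jar_mtimes /path/to/yarn/local-dir
```

If the jars carry a timestamp later than the moment the executor started its metrics system, that would be consistent with the late-distribution explanation above.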

stoader commented on June 11, 2024

@paskalka is this still an issue?

paskalka commented on June 11, 2024

y0908105023 commented on June 11, 2024

I had the same problem. How did you solve it? By adding the jars to the Spark folder?
env: Spark on Yarn

stoader commented on June 11, 2024

@y0908105023 are you running Spark on EMR? If so see #30 (comment)

y0908105023 commented on June 11, 2024

@stoader not Spark on EMR, just on Yarn

stoader commented on June 11, 2024

@y0908105023 what does your spark-submit command look like?

y0908105023 commented on June 11, 2024

@stoader

  cd /data0/spark/spark-2.2.1-bin && ./bin/spark-submit \
    --jars /data0/job/lib/spark-streaming-kafka-0-10_2.11-2.1.0.jar,/data0/job/lib/kafka-clients-0.10.2.1.jar \
    --master yarn \
    --files /data0/spark/metrics/metrics.properties \
    --conf spark.metrics.conf=metrics.properties \
    --conf spark.metrics.conf.*.sink.console.class=org.apache.spark.streaming.metrics.sink.HttpSink \
    --conf spark.metrics.conf.*.sink.console.url=url \
    --deploy-mode cluster \
    --name metric_test \
    --num-executors 5 \
    --driver-memory 2g \
    --executor-memory 2g \
    --class org.apache.spark.streaming.cache.ReduceByKeyAndCache \
    /data0/spark/metrics/lib/spark-test-1.0-SNAPSHOT.jar

stoader commented on June 11, 2024

See https://github.com/banzaicloud/spark-metrics/blob/master/PrometheusSink.md for how to pass packages to Spark.

In case of Spark 2.2 you need to pass:

--repositories https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases

--packages com.banzaicloud:spark-metrics_2.11:2.2.1-1.0.0,io.prometheus:simpleclient:0.0.23,io.prometheus:simpleclient_dropwizard:0.0.23,io.prometheus:simpleclient_pushgateway:0.0.23,io.dropwizard.metrics:metrics-core:3.1.2

Also I'd recommend upgrading to Spark 2.3+ and the latest version of spark-metrics library.
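
As a rough sketch of where the two flags above go, the snippet below assembles a spark-submit command and prints it for review rather than running it. The repository URL and package coordinates are copied from this comment; the paths, class name and application jar are the ones from the command posted earlier in the thread and should be treated as placeholders. The function name `print_submit_command` is made up for illustration. This only shows flag placement; later comments in this thread report that executors on Yarn may still fail to load the sink class.

```shell
# Repository and package coordinates copied from the comment above (Spark 2.2, Scala 2.11).
REPO="https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases"
PACKAGES="com.banzaicloud:spark-metrics_2.11:2.2.1-1.0.0,io.prometheus:simpleclient:0.0.23,io.prometheus:simpleclient_dropwizard:0.0.23,io.prometheus:simpleclient_pushgateway:0.0.23,io.dropwizard.metrics:metrics-core:3.1.2"

# Print the full command instead of executing it.
# Paths, class name and application jar are placeholders taken from this thread.
print_submit_command() {
  set -- \
    --master yarn \
    --deploy-mode cluster \
    --repositories "$REPO" \
    --packages "$PACKAGES" \
    --files /data0/spark/metrics/metrics.properties \
    --conf spark.metrics.conf=metrics.properties \
    --class org.apache.spark.streaming.cache.ReduceByKeyAndCache \
    /data0/spark/metrics/lib/spark-test-1.0-SNAPSHOT.jar
  echo "./bin/spark-submit $*"
}

print_submit_command
```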

y0908105023 commented on June 11, 2024

Which version will solve this problem?

stoader commented on June 11, 2024

@y0908105023 I'd suggest using Spark 2.3+ and passing the following to your spark-submit command:

  --packages com.banzaicloud:spark-metrics_2.11:2.3-2.1.0,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2

prcastro commented on June 11, 2024

Hey @stoader, I just tried your suggestion and it doesn't seem to be working. I still receive the error mentioned above. The funny thing is that the driver seems to have no problem with those packages. So if I do

  --conf spark.metrics.conf.driver.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink

The application runs without problems (the only problem is that I don't have executor metrics). However, if I run with

  --conf spark.metrics.conf.*.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink

It raises the exception. Am I using the packages in the wrong manner? How can I solve this? Is manually copying the jars to the executor the only solution?

stoader commented on June 11, 2024

Yes, you need to copy the needed jars to the hosts where the executors run.
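
A minimal sketch of such a copy step is shown below, assuming passwordless `scp` access to every executor host. The function name `distribute_jar`, the host-list file and the destination directory are all hypothetical. With `DRY_RUN=1` the script only prints the `scp` commands it would run.

```shell
# Copy one jar into the same directory on every host listed in a file
# (one host per line; blank lines and lines starting with '#' are skipped).
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_jar() {
  jar="$1"
  hosts_file="$2"
  dest_dir="$3"
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in ''|'#'*) continue ;; esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp $jar $host:$dest_dir/"
    else
      # Keep scp from consuming the host list on stdin.
      scp "$jar" "$host:$dest_dir/" </dev/null
    fi
  done < "$hosts_file"
}

# Example (hypothetical paths):
#   DRY_RUN=1 distribute_jar ./spark-metrics.jar ./executor-hosts.txt /opt/spark/jars
```

The copied jars still have to be on the executor classpath, for instance by placing them in Spark's own jars directory on each host or by pointing `spark.executor.extraClassPath` at them; which option fits depends on your deployment.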

prcastro commented on June 11, 2024

I ended up just copying the jars and JMX config file to each node using a bootstrap action on EMR

KeithTt commented on June 11, 2024

Same problem here.

Version: standalone Spark 2.4 cluster with ZooKeeper.

metrics.properties

# Enable Prometheus for all instances by class name
*.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
# Prometheus pushgateway address
*.sink.prometheus.pushgateway-address-protocol=http
*.sink.prometheus.pushgateway-address=172.16.68.11:9091
*.sink.prometheus.period=10
# Enable HostName in Instance instead of Appid (Default value is false i.e. instance=${appid})
*.sink.prometheus.enable-hostname-in-instance=true

spark-conf

spark.jars  /data1/moji/soft/spark-thriftserver-2.4.0/jars/spark-metrics_2.12-2.4-1.0.6.jar

Should I copy the jar to all the workers? There are about 100 nodes.
