uber-common / jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
License: Other
Any reason why there is no StatsdOutputReporter? I'd be interested in writing one, as I use StatsD to collect metrics.
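If the project's Reporter contract matches the custom-reporter example in the README (a `report(profilerName, metrics)` method plus `close()`), a StatsD reporter could be sketched roughly as below. This is a hypothetical sketch using plain UDP; it implements `Closeable` so the snippet is self-contained, whereas a real implementation would declare `implements com.uber.profiling.Reporter`:

```java
import java.io.Closeable;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical StatsD reporter; a real one would implement com.uber.profiling.Reporter.
public class StatsdOutputReporter implements Closeable {
    private final String prefix;
    private final InetAddress host;
    private final int port;
    private final DatagramSocket socket;

    public StatsdOutputReporter(String prefix, String host, int port) throws Exception {
        this.prefix = prefix;
        this.host = InetAddress.getByName(host);
        this.port = port;
        this.socket = new DatagramSocket();
    }

    // Same shape as the custom-reporter example in the README.
    public void report(String profilerName, Map<String, Object> metrics) {
        for (Map.Entry<String, Object> e : metrics.entrySet()) {
            if (e.getValue() instanceof Number) {
                send(formatGauge(prefix, profilerName, e.getKey(), (Number) e.getValue()));
            }
        }
    }

    // StatsD gauge wire format: <name>:<value>|g
    static String formatGauge(String prefix, String profilerName, String key, Number value) {
        return prefix + "." + profilerName + "." + key + ":" + value + "|g";
    }

    private void send(String line) {
        try {
            byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(bytes, bytes.length, host, port));
        } catch (Exception ignored) {
            // Metric delivery failures must never crash the profiled JVM.
        }
    }

    @Override
    public void close() {
        socket.close();
    }
}
```

Non-numeric metric values are skipped here; a fuller version would also map counters and timers onto the `|c` and `|ms` StatsD types.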
I am trying to use the Uber JVM profiler to profile my Spark application (Spark 2.4, running on EMR 5.21).
The following is my cluster configuration:
[
{
"classification": "spark-defaults",
"properties": {
"spark.executor.memory": "38300M",
"spark.driver.memory": "38300M",
"spark.yarn.scheduler.reporterThread.maxFailures": "5",
"spark.driver.cores": "5",
"spark.yarn.driver.memoryOverhead": "4255M",
"spark.executor.heartbeatInterval": "60s",
"spark.rdd.compress": "true",
"spark.network.timeout": "800s",
"spark.executor.cores": "5",
"spark.memory.storageFraction": "0.27",
"spark.speculation": "true",
"spark.sql.shuffle.partitions": "200",
"spark.shuffle.spill.compress": "true",
"spark.shuffle.compress": "true",
"spark.storage.level": "MEMORY_AND_DISK_SER",
"spark.default.parallelism": "200",
"spark.serializer": "org.apache.spark.serializer.KryoSerializer",
"spark.memory.fraction": "0.80",
"spark.executor.extraJavaOptions": "-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35 -XX:OnOutOfMemoryError='kill -9 %p'",
"spark.executor.instances": "107",
"spark.yarn.executor.memoryOverhead": "4255M",
"spark.dynamicAllocation.enabled": "false",
"spark.driver.extraJavaOptions": "-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35 -XX:OnOutOfMemoryError='kill -9 %p'"
},
"configurations": []
},
{
"classification": "yarn-site",
"properties": {
"yarn.log-aggregation-enable": "true",
"yarn.nodemanager.pmem-check-enabled": "false",
"yarn.nodemanager.vmem-check-enabled": "false"
},
"configurations": []
},
{
"classification": "spark",
"properties": {
"maximizeResourceAllocation": "true",
"spark.sql.broadcastTimeout": "-1"
},
"configurations": []
},
{
"classification": "emrfs-site",
"properties": {
"fs.s3.threadpool.size": "50",
"fs.s3.maxConnections": "5000"
},
"configurations": []
},
{
"classification": "core-site",
"properties": {
"fs.s3.threadpool.size": "50",
"fs.s3.maxConnections": "5000"
},
"configurations": []
}
]
The profiler jar is stored in S3 (s3://mybucket/profilers/jvm-profiler-1.0.0.jar). While bootstrapping my core and master nodes, I run the following bootstrap script:
sudo mkdir -p /tmp
aws s3 cp s3://mybucket/profilers/jvm-profiler-1.0.0.jar /tmp/
I submit my EMR step as follows:
spark-submit --deploy-mode cluster --master=yarn ......(other parameters).........
--conf spark.jars=/tmp/jvm-profiler-1.0.0.jar --conf spark.driver.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter,metricInterval=5000 --conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter,metricInterval=5000
But I am unable to see any profiling-related output in the logs (I checked both the stdout and stderr logs for all containers). Is the parameter being ignored? Am I missing something? Is there anything else I could check to see why this parameter is being ignored?
Hi, we are trying to integrate the jvm-profiler in a ServiceMix/Apache Karaf environment. When we try to use the method duration profiler, we get the error below; however, the default CpuAndMemory profiler works.
java.lang.NoClassDefFoundError: com/uber/profiling/transformers/MethodProfilerStaticProxy
I have tried using the profiler, following the instructions in the README file, but I don't see any output or logs. Maybe I am missing something.
I am running a PySpark application using Jupyter (Spark version: 2.3.2).
.config("spark.jars","/tmp/profileJar/jvm-profiler-1.0.0.jar")
.config("spark.driver.extraJavaOptions","-javaagent:/tmp/profileJar/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.FileOutputReporter,metricInterval=5000,sampleInterval=5000,ioProfiling=true,outputDir=/tmp/ProfilerOut")
.config("spark.executor.extraJavaOptions","-javaagent:/tmp/profileJar/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.FileOutputReporter,metricInterval=5000,sampleInterval=5000,ioProfiling=true,outputDir=/tmp/ProfilerOut")
Please let me know if you have any inputs/suggestions.
mvn clean package fails a test case:
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.331 s <<< FAILURE! -- in com.uber.profiling.profilers.StacktraceCollectorProfilerTest
[ERROR] com.uber.profiling.profilers.StacktraceCollectorProfilerTest.profile -- Time elapsed: 0.114 s <<< FAILURE!
java.lang.AssertionError: expected:<4> but was:<6>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at com.uber.profiling.profilers.StacktraceCollectorProfilerTest.profile(StacktraceCollectorProfilerTest.java:101)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:578)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:316)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:240)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:214)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:155)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Other warnings:
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/surefire/common-junit4/3.1.2/common-junit4-3.1.2.jar (26 kB at 453 kB/s)
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running com.uber.profiling.YamlConfigProviderTest
1693218739503 com.uber.profiling.util.ExponentialBackoffRetryPolicy: Retrying (after sleeping 169 milliseconds) on exception: java.lang.RuntimeException: Failed getting url: http://localhost/bad_url
1693218739681 com.uber.profiling.util.ExponentialBackoffRetryPolicy: Retrying (after sleeping 318 milliseconds) on exception: java.lang.RuntimeException: Failed getting url: http://localhost/bad_url
[WARNING] 1693218740009 com.uber.profiling.YamlConfigProvider: Failed to read file: http://localhost/bad_url java.lang.RuntimeException: Failed after trying 3 times
at com.uber.profiling.util.ExponentialBackoffRetryPolicy.attempt(ExponentialBackoffRetryPolicy.java:77)
at com.uber.profiling.YamlConfigProvider.getConfig(YamlConfigProvider.java:74)
at com.uber.profiling.YamlConfigProvider.getConfig(YamlConfigProvider.java:61)
at com.uber.profiling.YamlConfigProviderTest.getConfigFromBadHttpUrl(YamlConfigProviderTest.java:103)
[WARNING] 1693218740596 com.uber.profiling.YamlConfigProvider: Failed to read file: not_exiting_file java.lang.RuntimeException: Failed after trying 3 times
at com.uber.profiling.util.ExponentialBackoffRetryPolicy.attempt(ExponentialBackoffRetryPolicy.java:77)
at com.uber.profiling.YamlConfigProvider.getConfig(YamlConfigProvider.java:74)
at com.uber.profiling.YamlConfigProvider.getConfig(YamlConfigProvider.java:61)
at com.uber.profiling.YamlConfigProviderTest.getConfig(YamlConfigProviderTest.java:43)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:578)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Hi @viirya ,
Sorry for any inconvenience.
I am trying to run Spark in local mode, along with the JVM profiler; however, I encounter an issue when trying to generate a flame graph from the Stacktrace Profiling result. I would appreciate any suggestions, thank you.
Best regards,
Run Spark along with the JVM profiler:
cch:target cch$ spark-submit --class com.github.ehiggs.spark.terasort.TeraSort --conf "spark.driver.extraJavaOptions=-javaagent:/Users/cch/eclipse-workspace/jvm-profiler/target/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.FileOutputReporter,ioProfiling=true,outputDir=/Users/cch/Downloads/Stacktrace.json" --conf "spark.executor.extraJavaOptions=-javaagent:/Users/cch/eclipse-workspace/jvm-profiler/target/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.FileOutputReporter,ioProfiling=true,outputDir=/Users/cch/Downloads/Stacktrace.json" /Users/cch/eclipse-workspace/spark-terasort/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar /Users/cch/Downloads/terasort_in /Users/cch/Downloads/terasort_out
Generate a flame graph from the Stacktrace Profiling result:
cch:target cch$ python /Users/cch/eclipse-workspace/jvm-profiler/stackcollapse.py -i /Users/cch/Downloads/Stacktrace.json/CpuAndMemory.json > /Users/cch/Downloads/StacktraceOut/CpuAndMemory.folded
Traceback (most recent call last):
File "/Users/cch/eclipse-workspace/jvm-profiler/stackcollapse.py", line 17, in <module>
assert 'stacktrace' in stacktraceLog, "Malformated json. 'stacktrace' key doesn't exist."
AssertionError: Malformated json. 'stacktrace' key doesn't exist.
Hi,
I am trying to run Spark in client mode, along with the JVM profiler, and I am not able to see any output in the YARN logs. The link below says that spark.executor.extraJavaOptions does not work in client mode, even though the Spark documentation says it's the spark.driver.extraJavaOptions variable that doesn't work.
I referred to the link below to see how the values need to be specified.
My spark-defaults.conf has the following lines:
spark.jars=hdfs:///user/lib/jvm-profiler-1.0.0.jar
spark.executor.extraJavaOptions -Djava.library.path=/home/spark-user/jvm-prof/jvm-profiler/target/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter,tag=spark-bench,metricInterval=5000,sampleInterval=100
Does this look OK? Please let me know. And where exactly can I see the profiler output?
Thanks
While building the jar for running with influx, the following error is seen:
[ERROR] /Users/mukund.vemuri/IdeaProjects/uber-common/jvm-profiler/src/main/java_influxdb/com/uber/profiling/reporters/InfluxDBOutputReporter.java:[9,20] cannot find symbol
[ERROR] symbol: class BatchOptions
[ERROR] location: package org.influxdb
[ERROR] /Users/mukund.vemuri/IdeaProjects/uber-common/jvm-profiler/src/main/java_influxdb/com/uber/profiling/reporters/InfluxDBOutputReporter.java:[150,39] cannot find symbol
[ERROR] symbol: variable BatchOptions
[ERROR] location: class com.uber.profiling.reporters.InfluxDBOutputReporter
It appears that the version of influxdb-java is old:
<dependency>
<groupId>org.influxdb</groupId>
<artifactId>influxdb-java</artifactId>
<version>2.7</version>
</dependency>
To resolve this, it is recommended to update the influxdb-java dependency in the pom.
Hi Team,
It would be good if anyone has an example of how to use it with Spark and HDFS.
Thanks,
Sohil Raghwani
spark-submit \
--master yarn \
--deploy-mode client \
--executor-memory 10G \
--executor-cores 3 \
--driver-memory 5g \
--conf "spark.driver.extraJavaOptions=-javaagent:/data/leo_jie/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.InfluxDBOutputReporter,tag=influxdb,configProvider=com.uber.profiling.YamlConfigProvider,configFile=/data/leo_jie/Influxdb.yaml,metricInterval=5000,sampleInterval=5000,ioProfiling=true" \
--files /data/leo_jie/Influxdb.yaml \
--jars /data/leo_jie/jvm-profiler-1.0.0.jar \
/data/leo_jie/spark_test.py
The Spark application won't exit automatically when it finishes!
This ticket is a suggestion for improving the documentation by adding a description of the metrics collected by the profiler.
As noted in #50, there is currently no documentation describing the metrics that are collected.
Adding a description of what each collected metric represents could add value to the documentation.
The recommendation is to add something along the lines of the sample below.
Name | Description | Sample Value | Note |
---|---|---|---|
bufferPools-direct-totalCapacity | Available capacity in bytes in the direct buffer pool. | 18949 | Capacity should be similar to memoryUsed for a healthy JVM. |
gc-PSMarkSweep-collectionCount | Total number of collections that have occurred for PS MarkSweep. | 7 | -1 if collection count is undefined for this collector. |
I have started some work on this, which can be found here. The information provided in the markdown is based on the source code of the profiler, the java.lang documentation of the APIs used by the profiler, and personal knowledge.
The documentation linked above is incomplete since I wanted to ensure that this change is acceptable before spending more time on it.
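For context on where values like those in the sample table come from: they map onto standard java.lang.management MXBeans. A small sketch (the printed names follow the table's naming convention, which is an assumption about the profiler's exact keys):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class MetricSources {
    public static void main(String[] args) {
        // bufferPools-<name>-totalCapacity style metrics come from BufferPoolMXBean.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println("bufferPools-" + pool.getName() + "-totalCapacity: " + pool.getTotalCapacity());
        }
        // gc-<name>-collectionCount style metrics come from GarbageCollectorMXBean;
        // -1 means the collection count is undefined for that collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("gc-" + gc.getName() + "-collectionCount: " + gc.getCollectionCount());
        }
    }
}
```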
Could more details be given on how to use the profiler (with different consoles) for a Spark application?
I could find details in the README for a normal Java application, but not for a Spark application.
I am trying to use the JVM profiler on a supercomputing cluster. To run the tool, it requires the HDFS URL, but the admins helping me set it up on the cluster would like to know if this tool can run on a regular filesystem, since using the HDFS URL could open up security issues on their end. I appreciate your help in this regard.
Is it possible to integrate it with a Spring MVC + Tomcat app?
Hello,
I have been using the JVM profiler on a supercomputing cluster for a while now. The spark.driver.extraJavaOptions string seems to work fine, and the ConsoleOutputReporter writes profiling information to the output log file. When I use the spark.executor.extraJavaOptions string, the profiler doesn't seem to work. Spark on the cluster is set up as Standalone and runs in client mode. I was hoping you could help me with this.
The command I use to run the profiler is:
spark-submit --executor-cores 8 --driver-memory 20G --executor-memory 20G --conf spark.driver.extraJavaOptions=-javaagent:/home/rvenka21/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReport --conf spark.executor.extraJavaOptions=-javaagent:/home/rvenka21/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReport --class Main "tool to be profiled"
Log information:
Java Agent 1.0.0 premain args: reporter=com.uber.profiling.reporters.ConsoleOutputReporter
1554139518375 com.uber.profiling.Arguments: Got argument value for reporter: com.uber.profiling.reporters.ConsoleOutputReporter
ConsoleOutputReporter - ProcessInfo: {"jvmInputArguments":"","role":"driver","jvmClassPath":"","epochMillis":1554139518509,"cmdline":"/usr/java/latest/bin/java -cp /home/rvenka21/mycluster.conf/spark/:/share/apps/compute/spark/spark-2.4.0-bin-hadoop2.6/jars/*:/home/rvenka21/mycluster.conf/ -Xmx20G -javaagent:/home/rvenka21/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter org.apache.spark.deploy.SparkSubmit --conf spark.driver.memory=20G --conf spark.executor.extraJavaOptions=-javaagent:/home/rvenka21/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter --conf spark.driver.extraJavaOptions=-javaagent:/home/rvenka21/jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.ConsoleOutputReporter --class Main --executor-cores 8 --executor-memory 20G /oasis/projects/nsf/uic367/rvenka21/Genomics_EpiQuant/SPAEML/target/SPAEML-0.0.1-jar-with-dependencies.jar StepwiseModelSelection --epiq /user/input/SNP.200.Sample.300.Genotype.epiq -P /user/input/SNP.200.Sample.300.Phenotype.epiq -o /user/input/cluster_mode ","appId":null,"name":"[email protected]","host":"comet-03-02","processUuid":"bca1b961-e27a-42bd-ad57-bedbb8d4c095","agentVersion":"1.0.0","appClass":"Main","xmxBytes":21474836480,"appJar":null}
1554139519286 com.uber.profiling.AgentImpl: Finished one time profiler: com.uber.profiling.profilers.ProcessInfoProfiler@5427c60c
With Regards,
Ram
Could you please provide some examples of the config file?
I think FileOutputReporter can't write to S3 or HDFS, I might contribute an S3FileOutputReporter if that would be useful for this project...
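A rough, hypothetical shape such a reporter could take (class name, bucket, and key layout are all invented here; the actual AWS SDK upload is left as a comment so the sketch stays dependency-free):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical sketch only: nothing here exists in the project yet.
public class S3FileOutputReporterSketch {
    private final String bucket;
    private final String keyPrefix;

    public S3FileOutputReporterSketch(String bucket, String keyPrefix) {
        this.bucket = bucket;
        this.keyPrefix = keyPrefix;
    }

    // Mirror FileOutputReporter's one-file-per-profiler layout, but as S3 object URIs.
    public String objectUri(String profilerName, String processUuid) {
        return "s3://" + bucket + "/" + keyPrefix + "/" + processUuid + "/" + profilerName + ".json";
    }

    public void report(String profilerName, Map<String, Object> metrics) {
        byte[] body = metrics.toString().getBytes(StandardCharsets.UTF_8); // real code would serialize to JSON
        // An actual implementation would upload `body` with the AWS SDK here,
        // targeting objectUri(profilerName, <process uuid>).
        System.out.println("would upload " + body.length + " bytes");
    }
}
```

Keeping one object per process UUID per profiler avoids concurrent writers clobbering each other, which is the main way S3 differs from a local output directory.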
We should add support for custom user defined metrics.
We could add a Dropwizard Metrics profiler.
This would be quite useful, as it allows a single library for metrics.
ThreadMXBean from the ManagementFactory can provide thread-level metrics in the profiler, which could be very useful when trying to debug memory issues as well. A few functions I have found useful:
threadMXBean.getThreadCount();
threadMXBean.getPeakThreadCount();
threadMXBean.getTotalStartedThreadCount();
I could add these in a PR, perhaps to an existing profiler class, or create a new thread-level profiler class.
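A minimal sketch of what the collection step of such a thread-level profiler could look like (the class and metric key names are suggestions, not existing profiler keys):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class ThreadInfoProfiler {
    // Collect the thread-level counters mentioned above into a metrics map,
    // the same shape the existing profilers hand to a Reporter.
    public static Map<String, Object> collect() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        Map<String, Object> metrics = new HashMap<>();
        metrics.put("liveThreadCount", bean.getThreadCount());               // int: currently live threads
        metrics.put("peakThreadCount", bean.getPeakThreadCount());           // int: max since JVM start (or reset)
        metrics.put("totalStartedThreadCount", bean.getTotalStartedThreadCount()); // long: cumulative
        return metrics;
    }
}
```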
I would like to get details about the number of times a method is invoked and the duration of each execution. I believe I should use the 'durationProfiling' feature. I am using the following command:
-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.FileOutputReporter,outputDir=/tmp/ProfilerOut,ioProfiling=true,sampleInterval=2000,durationProfiling=io.jaal.spark.ingest.Publisher.getProducer
However, I am only seeing 'ProcessInfo.json' (along with 'CpuAndMemory.json', 'IO.json' and 'Stacktrace.json'), and they do not seem to have the above metrics. Am I doing this correctly? Could anyone please help me here?
When using the InfluxDB reporter on Spark 2.2.0 with Ambari HDP, I'm getting the following error.
I ran mvn -P influxdb clean package to build the jar.
Using the following command (with classname / host replaced):
spark-submit --master yarn-cluster --class com.output.spark.FilteringJob --conf spark.jars=hdfs:///user/smorgasborg/lib/jvm-profiler-1.0.0.jar --conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.InfluxDBOutputReporter,tag=profiling,sampleInterval=1000,influxdb.host={{HOST HERE}},influxdb.port=8086,influxdb.database=test_profiling --conf spark.yarn.am.waitTime=200s spark-output-assembly.jar
[WARNING] 1544111654753 com.uber.profiling.ProfilerRunner: Failed to run profile: com.uber.profiling.profilers.StacktraceReporterProfiler@b9afc07 java.lang.NoSuchMethodError: okio.ByteString.encodeString(Ljava/lang/String;Ljava/nio/charset/Charset;)Lokio/ByteString;
at okhttp3.Credentials.basic(Credentials.java:35)
at okhttp3.Credentials.basic(Credentials.java:30)
at ujagent_shaded.org.influxdb.impl.BasicAuthInterceptor.<init>(BasicAuthInterceptor.java:15)
at ujagent_shaded.org.influxdb.impl.InfluxDBImpl.<init>(InfluxDBImpl.java:153)
at ujagent_shaded.org.influxdb.impl.InfluxDBImpl.<init>(InfluxDBImpl.java:122)
at ujagent_shaded.org.influxdb.impl.InfluxDBImpl.<init>(InfluxDBImpl.java:185)
at ujagent_shaded.org.influxdb.InfluxDBFactory.connect(InfluxDBFactory.java:48)
at com.uber.profiling.reporters.InfluxDBOutputReporter.ensureInfluxDBCon(InfluxDBOutputReporter.java:148)
at com.uber.profiling.reporters.InfluxDBOutputReporter.report(InfluxDBOutputReporter.java:57)
at com.uber.profiling.profilers.StacktraceReporterProfiler.profile(StacktraceReporterProfiler.java:118)
at com.uber.profiling.ProfilerRunner.run(ProfilerRunner.java:38)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Hi, there are multiple versions of com.fasterxml.jackson.core:jackson-core in jvm-profiler-master. As shown in the following dependency tree, according to Maven's “nearest wins” strategy, only com.fasterxml.jackson.core:jackson-core:2.8.11 can be loaded, com.fasterxml.jackson.core:jackson-core:2.10.0.pr1 will be shadowed.
As com.fasterxml.jackson.core:jackson-core:2.10.0.pr1 is not loaded during the build, several methods are missing. The missing methods are:
1. org.apache.maven.artifact.versioning.ManagedVersionMap: void <init>(java.util.Map)
2. com.fasterxml.jackson.core.type.WritableTypeId: void <init>(java.lang.Object, com.fasterxml.jackson.core.JsonToken)
3. com.fasterxml.jackson.core.JsonGenerator: com.fasterxml.jackson.core.type.WritableTypeId writeTypePrefix(com.fasterxml.jackson.core.type.WritableTypeId)
The above missing methods are actually referenced by jvm-profiler-master, which will cause “NoSuchMethodErrors” at runtime.
Suggested fixing solutions:
Please let me know which solution you prefer. I can submit a PR to fix it.
Thank you very much for your attention.
Best regards,
Dependency tree:
Can I convert the JSON results, or the folded output, to any other format?
Not working on Spark
LogType:stderr
Log Upload Time:Fri May 22 15:36:26 +0800 2020
LogLength:72
Log Contents:
Error opening zip file or JAR manifest missing : jvm-profiler-1.0.0.jar
LogType:stdout
Log Upload Time:Fri May 22 15:36:26 +0800 2020
LogLength:84
Log Contents:
Error occurred during initialization of VM
agent library failed to init: instrument
spark-defaults.conf
spark.jars=hdfs://my_domain/DW/app/jars/jvm-profiler-1.0.0.jar
spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,metricInterval=5000,brokerList=10.201.5.57:9092,topicPrefix=monitor.spark.
Hi, I have been using Uber JVM profiler for profiling my Spark applications.
May I know if the profiler provides information on cross-node communication?
I'm trying to use jvm-profiler to monitor Spark executors on YARN. I find jvm-profiler very helpful, but so far I have only received the JVM's memory debug info when executing the command below:
--conf spark.jars=hdfs://SERVICE-HADOOP-ff0917a859de41be8d3371e3f64b7b9f/lib/jvm-profiler-1.0.0.jar --conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar=reporter=com.uber.profiling.reporters.KafkaOutputReporter,sampleInterval=3000,brokerList=awaken122:9092,topicPrefix=profiler_
Actually, I want to get the Stacktrace Profiling output to build a flame graph. Could you help me build the right command?
While profiling a Spark application, if we use the following setting:
reporter=com.uber.profiling.reporters.FileOutputReporter,outputDir=/tmp/ProfilerOut
is the output present on all the executor nodes, or will it get aggregated so that the driver writes it to its local directory?
I am planning to run some profiling on a production test dataset, but some engineers do not have access to the machines.
Is there an easy way to correlate the executor UUID from the profiler with the Spark UI executor ID? I agree that adding anything Spark-specific would defeat the purpose of a generic JVM profiler.
I found that description... and then?
Hi, I have an idea, as in the title: the output file cannot be allowed to keep growing in streaming jobs, so we have to use a rolling file. What do you think?
Currently, we report all the metrics at a scheduled interval and it ends up creating a lot of noisy data.
Would it be better to only report metrics after a given threshold is breached?
Examples: report HeapUsed when it is above 60% of the heap max; report GC when the time spent is more than X.
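A sketch of how such threshold gating could look, using the 60%-of-heap and GC-time examples above (class and method names are hypothetical):

```java
public class ThresholdGate {
    private final double heapUsedFraction;    // e.g. 0.60 for "60% of heap max"
    private final long gcTimeMillisThreshold; // the "X" in the GC example above

    public ThresholdGate(double heapUsedFraction, long gcTimeMillisThreshold) {
        this.heapUsedFraction = heapUsedFraction;
        this.gcTimeMillisThreshold = gcTimeMillisThreshold;
    }

    // Report heap usage only when it breaches the configured fraction of the max heap.
    public boolean shouldReportHeap(long heapUsedBytes, long heapMaxBytes) {
        return heapMaxBytes > 0 && heapUsedBytes > heapUsedFraction * heapMaxBytes;
    }

    // Report GC metrics only when GC time since the last report exceeds the threshold.
    public boolean shouldReportGc(long gcTimeDeltaMillis) {
        return gcTimeDeltaMillis > gcTimeMillisThreshold;
    }
}
```

A profiler could still sample on its scheduled interval but consult a gate like this before handing metrics to the reporter, which keeps the noisy steady-state data out of the sink.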
I find the usage of this tool somewhat limiting. Could you call out the limitations of this framework in the README, please? I'd appreciate it.
For example, see the stack trace below. What is it about?
Caused by: java.lang.IllegalStateException: Unable to load cache item
at org.springframework.cglib.core.internal.LoadingCache.createEntry(LoadingCache.java:79)
at org.springframework.cglib.core.internal.LoadingCache.get(LoadingCache.java:34)
at org.springframework.cglib.core.AbstractClassGenerator$ClassLoaderData.get(AbstractClassGenerator.java:134)
at org.springframework.cglib.core.AbstractClassGenerator.create(AbstractClassGenerator.java:319)
at org.springframework.cglib.proxy.Enhancer.createHelper(Enhancer.java:569)
at org.springframework.cglib.proxy.Enhancer.createClass(Enhancer.java:416)
at org.springframework.aop.framework.ObjenesisCglibAopProxy.createProxyClassAndInstance(ObjenesisCglibAopProxy.java:58)
at org.springframework.aop.framework.CglibAopProxy.getProxy(CglibAopProxy.java:205)
... 42 common frames omitted
Caused by: java.lang.VerifyError: (class: com/optum/voyager/core/persistence/service/repository/HL7ResourceRepository$$EnhancerBySpringCGLIB$$18453215, method: findHL7Resource signature: (Ljava/lang/String;Ljava/util/UUID;Lcom/optum/voyager/datamodel/enums/HL7ResourceType;)Lcom/optum/voyager/core/persistence/service/entity/HL7Resource;) Inconsistent stack height 2 != 1
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
The wildcard ('*') for durationProfiling doesn't show up in the README file because it is not properly escaped.
I would like to know how 'durationProfiling' works in the case of a Spark Streaming application.
Hi,
It would be really useful to have this distributed via Maven Central.
This is already being done for some other Uber projects (e.g. hudi, rave), so your organisation has already claimed the groupId - hopefully it's a small effort.
Thanks
Hi folks,
It would be great if Uber could open-source their analytics framework, or at least write a blog post on how to use the collected metrics to tune a job, such as which config gets tuned and the thought process behind the decision-making.
Thanks
I need to profile multiple packages. I tried
durationProfiling=com.optum.voyager.core.persistence.service.repository.HL7ResourceRepository.*,com.optum.voyager.core.persistence.service.repository.ElasticServiceImpl.*
with the following result:
Caused by: java.lang.IllegalArgumentException: Arguments for the agent should be like: key1=value1,key2=value2 at com.uber.profiling.Arguments.parseArgs(Arguments.java:94) at com.uber.profiling.Agent.premain(Agent.java:35)
What is the right way to specify multiple packages?
Could you please provide some material/links that explain what each field in the profiler output means?
I am able to get profiler output for my Spark Streaming application, but I find it hard to understand the fields and their meanings.
I am specifically interested in the CpuAndMemory and MethodDuration output fields.