Comments (23)

ziyanTOP commented on May 27, 2024

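Same problem here: minor GC could not keep up with the growth of the old generation, and in the end all three FE nodes hung and went down with queries queued and timing out. I suggest monitoring the FE JVM with Prometheus + Grafana to see where the problem actually is. While you are at it, change your parameters: make the young generation one third of the old generation, and instead of something like -XX:NewRatio=3, set it to a fixed -Xmn16G. Enable CMS parallel remark, otherwise that much memory cannot be marked in the short time a minor GC takes. Then lower the CMS initiating occupancy fraction: 80% is too late, and the service may go down before the GC finishes. You can change it to 60 or 65. This worked in practice: since I made these adjustments, no FE in my cluster has gone down.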
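Until a Prometheus + Grafana dashboard is in place, the old-generation trend described here can be spot-checked from a shell. This is a minimal sketch, assuming `jstat` from the JDK is on the PATH and that its `-gcutil` output has a header row with `O` (old-generation occupancy in percent) and `FGC` (full GC count) columns, as on JDK 8. The function name `flag_old_gen`, the variable `FE_PID`, and the 65% threshold are illustrative, not from the thread.

```shell
#!/bin/sh
# flag_old_gen THRESHOLD
# Reads `jstat -gcutil` output on stdin and prints one line for every sample
# whose old-generation occupancy exceeds THRESHOLD percent.
# Columns are located by header name rather than by fixed position.
flag_old_gen() {
    awk -v limit="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "O")   o = i
                if ($i == "FGC") f = i
            }
            next
        }
        o && ($o + 0) > (limit + 0) {
            printf "old gen %s%% after %s full GCs\n", $o, $f
        }
    '
}

# Example (FE_PID is a placeholder for the FE process id), sampling every 5 s:
#   jstat -gcutil "$FE_PID" 5000 | flag_old_gen 65
```

If the flagged percentage keeps climbing while the full GC count also rises, the old generation is filling faster than it is being collected, which matches the symptom reported above.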

from doris.

ziyanTOP commented on May 27, 2024

JAVA_OPTS="-server -Xmx64g -Xmn16g -Xms32g -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=65 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"

@zhbdesign Set the actual memory sizes according to what your machine has.
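To make the sizing in these options concrete, the arithmetic behind them can be worked through as follows. This is a sketch that assumes the heap has grown to its -Xmx limit; the variable names are illustrative.

```shell
#!/bin/sh
# Sizes in GiB, taken from the JAVA_OPTS in this comment.
heap_max=64       # -Xmx64g
young=16          # -Xmn16g
cms_fraction=65   # -XX:CMSInitiatingOccupancyFraction=65

old=$((heap_max - young))                           # old generation at full heap
ratio=$((old / young))                              # old : young
cms_start_mb=$((old * 1024 * cms_fraction / 100))   # old-gen usage (MiB) at which CMS starts

echo "old=${old}g ratio=${ratio}:1 cms_start=${cms_start_mb}m"
```

At full heap this gives the same 3:1 split as -XX:NewRatio=3. The difference is that -Xmn pins the young generation at 16g even while the heap is still near -Xms32g, whereas NewRatio would scale it with the current heap size. Note also that unless -XX:+UseCMSInitiatingOccupancyOnly is set as well, HotSpot treats the occupancy fraction as a starting hint and may adjust the trigger on its own.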

zengxiangqi1031 commented on May 27, 2024

Ran into the same problem on the Doris 2.0.2 release.

DA1OOO commented on May 27, 2024

JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false -Xss4m -Xmx64g -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$CUR_DATE"
image

DA1OOO commented on May 27, 2024

I only changed the -Xmx size; everything else is the default JVM configuration.

liugddx commented on May 27, 2024

Try using G1GC.
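For readers who want to try this, here is a minimal sketch of what a G1 variant of the JAVA_OPTS posted in this thread might look like. The heap sizes and the pause target are placeholder assumptions, not values recommended by anyone in the thread, and the GC logging flags assume JDK 8, as in the options quoted in this thread.

```shell
# Hypothetical G1 variant; sizes are placeholders to adapt to the machine.
# DORIS_HOME and DATE are expected to be set by the FE start script, as in
# the options quoted in this thread.
JAVA_OPTS="-server -Xmx64g -Xms64g \
 -XX:+UseG1GC \
 -XX:MaxGCPauseMillis=200 \
 -XX:+PrintGCDateStamps -XX:+PrintGCDetails \
 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"
```

The CMS-specific flags (-XX:+UseConcMarkSweepGC, -XX:+UseParNewGC, -XX:CMSInitiatingOccupancyFraction and so on) are dropped because they do not apply to G1. -Xmn is left out as well, since fixing the young generation size interferes with G1's pause-time targeting.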

DA1OOO commented on May 27, 2024

> Using G1GC

Thanks, I will try it.

DA1OOO commented on May 27, 2024

Btw, I used broker 2.0.2, not 2.0.1.1.

DA1OOO commented on May 27, 2024

image

liugddx commented on May 27, 2024

https://doris.apache.org/zh-CN/docs/1.2/admin-manual/query-profile?_highlight=profile#%E5%90%8D%E8%AF%8D%E8%A7%A3%E9%87%8A

Maybe you need to turn the profile off globally: SET [GLOBAL] enable_profile=false;
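A sketch of how this could be applied from a shell, assuming the `mysql` command-line client is installed and the FE accepts MySQL-protocol connections on its default query port 9030. The host `fe-host`, the user `root`, and the function name `disable_profile` are placeholders.

```shell
#!/bin/sh
# Placeholders: adjust to your cluster.
FE_HOST="${FE_HOST:-fe-host}"
FE_QUERY_PORT="${FE_QUERY_PORT:-9030}"

# Turn the profile off globally, then show the value the session now sees.
PROFILE_SQL="SET GLOBAL enable_profile=false; SHOW VARIABLES LIKE 'enable_profile';"

# Defined but not invoked here; call it once the placeholders are filled in.
disable_profile() {
    mysql -h "$FE_HOST" -P "$FE_QUERY_PORT" -u root -p -e "$PROFILE_SQL"
}
```

Calling `disable_profile` prompts for the password and runs both statements in one connection.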

DA1OOO commented on May 27, 2024

After reviewing the source code, the default max_query_profile_num seems to be 100, so it wouldn't keep pushing profiles into memory?

liugddx commented on May 27, 2024

> After reviewing the source code, the default max_query_profile_num seems to be 100, so it wouldn't keep pushing profiles into memory?

I don't have a detailed understanding of this yet. You can keep following up or provide more detailed log information.

DA1OOO commented on May 27, 2024

After restarting the FE and running SET [GLOBAL] enable_profile=false:
image
image
I had a broker load task running from 11:34 to 11:36, which is when the memory increased rapidly.

liugddx commented on May 27, 2024

Has this memory problem affected usage? Also, does GC reclaim the memory?

DA1OOO commented on May 27, 2024

I need to observe how memory changes after disabling enable_profile. Before disabling it, GC reclaimed only a little memory, and once memory reached the maximum set by -Xmx, the FE stopped serving.
image

wj215318 commented on May 27, 2024

> I need to observe how memory changes after disabling enable_profile. Before disabling it, GC reclaimed only a little memory, and once memory reached the maximum set by -Xmx, the FE stopped serving.
> image

How is the FE memory after disabling enable_profile? Thanks.

DA1OOO commented on May 27, 2024

image
It seems to be back to normal now. Maybe profile removal has a bug. @wj215318

wj215318 commented on May 27, 2024

> image
> It seems to be back to normal now. Maybe profile removal has a bug. @wj215318

We have encountered the same problem, and we have now disabled the profile as well. Yesterday we dumped the JVM data, and our DBA is analyzing it.

DA1OOO commented on May 27, 2024

Due to the impact of dumping on the normal use of the cluster, we did not dump the JVM data. If you discover anything after dumping, please share the specific situation here. @wj215318 Thanks!

DA1OOO commented on May 27, 2024

@wj215318 Btw, the 2.0.2 release doesn't have this problem.

zhbdesign commented on May 27, 2024

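> Same problem here: minor GC could not keep up with the growth of the old generation, and in the end all three FE nodes hung and went down with queries queued and timing out. I suggest monitoring the FE JVM with Prometheus + Grafana to see where the problem actually is. While you are at it, change your parameters: make the young generation one third of the old generation, and instead of something like -XX:NewRatio=3, set it to a fixed -Xmn16G. Enable CMS parallel remark, otherwise that much memory cannot be marked in the short time a minor GC takes. Then lower the CMS initiating occupancy fraction: 80% is too late, and the service may go down before the GC finishes. You can change it to 60 or 65. This worked in practice: since I made these adjustments, no FE in my cluster has gone down.

Could you share your modified startup parameters?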

DA1OOO commented on May 27, 2024

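After switching to the G1 collector and increasing the JVM memory, things are normal for now.
image
I still don't understand why memory grows so fast.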

ihadoop commented on May 27, 2024

Could you upload the dumped file here?
