Comments (21)
@tamalsaha, Yeah might be we also saw high cpu usage when memory growth was also high. From your call graph it seems most of the time it spend reading the setting frame , writing setting frame and writing setting frame ack. So it might be related to what i see .
from cmux.
we don't see the memory leak any more . even if it is leaking else where the reason pprof is showing as cmux is because that is the where the allocation happens when a new request comes. But now we dont see the leak any more . we set both framer.SetMaxReadFrameSize(65536) and looped only 128 times as you suggested before.
i wonder they have some impact.
Thanks for your patience.
from cmux.
Well, even if the code path is the major source of allocations in pprof, it doesn't really mean there is a legit leak.
Having said that, I can have two explanations:
- A client or a collection of clients are sending frames that are neither settings nor header.
- A client or a collection of clients are sending very large frames.
I'm not sure which one is the root-cause. Can you please apply the following (hacky) patch to your project and test if you see a major change? Feel free to increase the 64K and 128 hardcoded values if needed.
diff --git a/matchers.go b/matchers.go
index 485ede8..74a3d10 100644
--- a/matchers.go
+++ b/matchers.go
@@ -146,6 +146,7 @@ func matchHTTP2Field(w io.Writer, r io.Reader, name, value string) (matched bool
done := false
framer := http2.NewFramer(w, r)
+ framer.SetMaxReadFrameSize(65536)
hdec := hpack.NewDecoder(uint32(4<<10), func(hf hpack.HeaderField) {
if hf.Name == name {
done = true
@@ -154,7 +155,7 @@ func matchHTTP2Field(w io.Writer, r io.Reader, name, value string) (matched bool
}
}
})
- for {
+ for i := 0; i < 128; i++ {
f, err := framer.ReadFrame()
if err != nil {
return false
@@ -181,4 +182,5 @@ func matchHTTP2Field(w io.Writer, r io.Reader, name, value string) (matched bool
return matched
}
}
+ return matched
}
from cmux.
Yeah we have Java clients which connect. In the second scenario when you say large frame what is the size you expect them to be. I don't expect that to happen, because the memory growth is over a time frame , typically over a span of 15 days. So can we rule that out ?
Coming to clients which are neither sending setting nor header framer , i am suspecting may be some java clients are behaving like that. i couldn't reproduce the scenario with go clients in my tests.
Thanks for the workaround. I will try it and let you know. But it would take a while before i figure out since memory growth happens slowly .
from cmux.
Hello soheilhy ,
I have more details about when this memory growth might happen since we see it consistently in our production. I tried using the above fix but that makes all the client having their connections rejected. That opened a interesting aspect by all the clients are sending the correct headers frames so late after 128 frames.
So we tried simulating that using the sample grpc hello world program and these were our observations. Please download the latest sample client and server from
https://github.com/chandradeepak/memgrowth.
I have your fix of code with couple of print statements in
memgrowth/server/vendor/github.com/soheilhy/cmux/matchers.go
I tweaked to found the when i would be able to find the frame when multiple connections happen. so i am doing a for ever loop but once it finds the frame i print the count of it. with multiple connections we see there is a increase in the count of loop when correct frame would appear.
And i have seen the count go up to 1000K as well. If you consider if for one connection it reads 1MM frames you can expect there would be memory growth and that would leak .
One other notable mention here is i didn't see this with the sample hello world pb. So i just added a new rpc with bidirectional stream called send. That triggered it.
That is there in the directory server/vendor/google.golang.org/grpc/examples/helloworld/helloworld/helloworld.proto
rpc Send (stream HelloRequest) returns (stream HelloReply) {}
we have a pool of connections which create multiple streams and send data one after another , assuming we a new bufferedReader for each MuxConn. So multiple connections shouldn't impact each other.
please run the sample client and server to find out more. I am not sure where the problem is . might be in grpc .
Please let us know what you think and let me know if you need help in reproducing.
from cmux.
One thing i found out is in the
func matchHTTP2Field(w io.Writer, r io.Reader, name, value string) (matched bool) { case *http2.SettingsFrame: if !f.IsAck(){ if err := framer.WriteSettings(); err != nil { return false } } }
if we do ignore the settings frame with acks we can get rid of what i saw earlier. seems like we are trying to accept setting frames with acks as well which caused this . But setting frames are no payload frames so that doesn't answer the memory increase . still debugging
from cmux.
@chandradeepak , any chance #42 is also related?
from cmux.
I'm still trying to understand if this is expected or not. After all, I need to dump the setting frame, and that consumes more CPU and memory.
Can you guys test what happens if we changeWriteSettings
to WriteSettingsAck
in this line:
Line 165 in bf4a8ed
from cmux.
sure i will test it and let you know.
from cmux.
Is there any progress on this issue?
from cmux.
We have stopped using cmux because of the cpu usage issue.
from cmux.
Sorry to hear that, but you are using m.MatchWithWriters(...)
and that's the expected consequence. you're problem will go away, if you use m.Match
.
from cmux.
The issue is we have Java client and that needs MatchWithWriters
.
from cmux.
Hi
in our case i deployed the fix in some nodes , but due to load balancing the traffic is not routed to the nodes where the fix is there . I will come back with the results once we get real traffic . sorry for the delay.
from cmux.
Thanks Chandra!
from cmux.
@soheilhy , do you know/think why Java clients need that settings frame? Do you think it will be worth filing bug to remove this requirements with the java-grpc repo?
from cmux.
@tamalsaha it's not needed to be implemented the way Java gRPC did, and it's standard conformant. It's really worth trying to fix it there. Thanks!
from cmux.
I filed a bug and cc-ed you. We will see what they say. Thanks!
from cmux.
I just merged a patch which might explain why you see the leak. @chandradeepak, could you try with head, again, please?
from cmux.
Sure @soheilhy i will try it out. Recently we haven't been seeing the issue and it rarely occurs in some machines . so i will deploy this fix in the next push of ours and let you know if that fixes the problem. But it takes some time before it reproduces .
Thanks for the fix .
from cmux.
Thank you @chandradeepak for the confirmation. I have a hunch that commit 0068a46 has resolved the root cause. would you mind reverting framer.SetMaxReadFrameSize(65536) and the 128 limit on the loop for a quick test?
from cmux.
Related Issues (20)
- Possibly move examples_test.go to examples package HOT 1
- Server unexpectedly doesn't receive traffic from cmux.Any() HOT 1
- Udp support?
- Add benchmarks
- badly-behaved client can deadlock server stopping HOT 3
- Cannot get Matchers to mux h2c gRPC and HTTP2(REST) requests HOT 4
- It's hard to extend protocol HOT 3
- how to write client codeto gprc connect HOT 2
- Doesn't work with grpc-go's hello world example HOT 2
- Pull In Downstream Fixes HOT 1
- Cannot check closed error with errors.Is HOT 1
- Multiplexing gRPC and Prometheus metrics services using Google Cloud Run
- Matching based on URL HOT 1
- Cmux example HOT 1
- Close not working properly on gRPC graceful stop HOT 1
- Multiplexing gRPC & HTTP handlers with H2C+Prior Knowledge HOT 4
- socks5 matcher not working HOT 2
- http2 matcher not working with net/http HOT 2
- I don't understand why calling source.Read() might block?
- HTTP1Fast matcher omits PATCH HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cmux.