Comments (5)
OK, it seems you addressed most of the issues I raised.
When you increase the remote address by 1, do you also decrease the length by 1? Otherwise you may be trying to write past the end of the registered area. It seems you already checked this, given your statement above.
Do you check whether each completion queue result was successful? If you get an error there, it might be easier to debug.
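To make the completion check concrete, here is a minimal sketch. It assumes you have already pulled the numeric status and work request id out of the polled work completion (in DiSNI that would be an accessor on the completion object; the exact accessor name is an assumption on my side). The class `WcStatusCheck` and its methods are hypothetical helpers, not part of DiSNI. The names follow the first values of `enum ibv_wc_status` in libibverbs.

```java
public final class WcStatusCheck {
    // Names indexed by the numeric values of enum ibv_wc_status (libibverbs).
    private static final String[] NAMES = {
        "IBV_WC_SUCCESS", "IBV_WC_LOC_LEN_ERR", "IBV_WC_LOC_QP_OP_ERR",
        "IBV_WC_LOC_EEC_OP_ERR", "IBV_WC_LOC_PROT_ERR", "IBV_WC_WR_FLUSH_ERR",
        "IBV_WC_MW_BIND_ERR", "IBV_WC_BAD_RESP_ERR", "IBV_WC_LOC_ACCESS_ERR",
        "IBV_WC_REM_INV_REQ_ERR", "IBV_WC_REM_ACCESS_ERR", "IBV_WC_REM_OP_ERR",
        "IBV_WC_RETRY_EXC_ERR", "IBV_WC_RNR_RETRY_EXC_ERR"
    };

    // Translate a raw status code into a readable name for log output.
    public static String describe(int status) {
        return (status >= 0 && status < NAMES.length)
            ? NAMES[status]
            : "UNKNOWN(" + status + ")";
    }

    // Fail loudly instead of silently continuing after a failed work request.
    public static void requireSuccess(long wrId, int status) {
        if (status != 0) {
            throw new IllegalStateException(
                "work request " + wrId + " failed: " + describe(status));
        }
    }

    public static void main(String[] args) {
        requireSuccess(1L, 0); // status 0 means success, nothing happens
        try {
            requireSuccess(2L, 10); // hypothetical failed completion
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A WRITE that runs past the end of the remote region would typically show up on the requester as a remote access error rather than as a silent no-op, which is why this check tends to shorten debugging.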
from disni.
I see a few problems with your code. On the remote side:
- You should always send the Rkey and not the Lkey to a remote side (although for most devices these keys are identical)
- You should clear the dataBuf before you send the buffer information (address, length, rkey) to the remote side. Otherwise, the remote write could happen just before you clear the buffer.
- The post receive is not necessary here. In fact, it will not do anything. You only need to post receive buffers when the remote side uses SEND; for WRITE you don't need to post a receive buffer. If you want to use SEND, you need to make sure that the receive buffer is posted before you send the buffer information. Otherwise, the remote side may issue a send before any buffer has been posted.
- You assume that once you get the receive event, the data is in the buffer. But the (first) event only tells you that the receive buffer has been posted. You will never get an event saying that data was received into this buffer because, as explained above, you are not using SEND on the remote side.
The local side looks OK. Just be aware that the event you get after a successful WRITE only indicates that the WRITE has been successfully sent to the remote side; it doesn't necessarily mean it has completed on the remote side (that is, been placed into memory).
You should try to understand the difference between one-sided (WRITE) and two-sided operations (SEND), so that you can see which operation makes the most sense in your case. Generally speaking one-sided operations are faster but harder to use.
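The ordering I mean for the remote side can be sketched as follows. This is only an illustration using plain `java.nio` buffers: `BufferDescriptor` and `publish` are hypothetical names, the (address, length, rkey) layout is an assumption that mirrors the order mentioned above, and the actual posting of the send is left out.

```java
import java.nio.ByteBuffer;

public final class BufferDescriptor {
    // Assumed wire layout: 8-byte address, 4-byte length, 4-byte rkey.
    public static final int BYTES = Long.BYTES + Integer.BYTES + Integer.BYTES;

    // Prepare the descriptor that tells the peer where it may WRITE.
    public static void publish(ByteBuffer dataBuf, ByteBuffer sendBuf,
                               long addr, int rkey) {
        // 1. Reset the data buffer BEFORE the peer learns about it, so a
        //    remote WRITE can never race with this reset.
        dataBuf.clear();

        // 2. Only now fill in the descriptor the peer will receive.
        sendBuf.clear();
        sendBuf.putLong(addr);
        sendBuf.putInt(dataBuf.capacity());
        sendBuf.putInt(rkey); // the rkey, not the lkey
        sendBuf.flip();       // ready to be handed to the send path
    }

    public static void main(String[] args) {
        ByteBuffer dataBuf = ByteBuffer.allocateDirect(64);
        ByteBuffer sendBuf = ByteBuffer.allocateDirect(BYTES);
        publish(dataBuf, sendBuf, 0x1000L, 42); // made-up address and rkey
        System.out.println("descriptor bytes: " + sendBuf.remaining());
    }
}
```

If you add a SEND for notification on top of this, the receive for that SEND has to be posted before the descriptor goes out, for the same reason.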
from disni.
Thanks for your corrections! I will respond to each of your points.
- I know I should send the rkey, but this code was adapted from src/test/java/com/ibm/disni/benchmarks/ReadServer.java, which writes `sendBuf.putInt(dataMr.getLkey())` on line 161. Since the rkey and lkey are equal and this is not the main problem, I ignored it.
- This does introduce the problem that the "remote write could happen just before I clear the buffer". I will pay attention to this later, but it is also not the main problem.
- I know that posting a receive is unnecessary for WRITE. I just want the sender to write the data and then notify the receiver through SEND, because I'm worried about the receiver starting to read before the sender has written. Maybe that's not something to worry about? Also, this is not the main problem.
- I understand this point.
But none of these seems to be the main problem, which is that the receiver can't read any data when I change `dataWR.getRdma().setRemote_addr(addr)` to `dataWR.getRdma().setRemote_addr(addr+1)`.
I originally thought that the rkey only granted access to the first address of the remote data buffer, so that after changing the address to `addr+1` I could no longer access it with that rkey and would need a key for `addr+1` instead. But after reading more about RDMA, I found that the rkey should grant access to every address in the region [address, address+len), so this was not the problem.
Then I suspected that the memory of the data buffer on the remote side is not contiguous, so that `addr+1` does not correspond to `dataBuf.position(1)`. But the dataBuf is allocated through `ByteBuffer.allocateDirect()`, so it should be contiguous memory. I checked the contiguity of this address range through `Unsafe`, and it was indeed fine.
So I don't understand what went wrong. I don't think this is a problem with my use of WRITE, because data written at `addr` can be read; only data written at `addr+1` cannot.
from disni.
Oh, that's exactly what I missed, the main problem! The length. The value passed to `sgeSend.setLength()` is set to a fixed buffer length in `init()`.
Thanks a lot! :)
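For anyone hitting the same symptom, here is a minimal sketch of the bounds rule I had violated: a WRITE of `writeLen` bytes starting at `targetAddr` must stay inside [address, address+len) of the registered region, so moving the remote address forward by one byte means shrinking the length by one byte as well. `RemoteWindow`, `fits` and `remainingLength` are hypothetical helpers written only for this illustration, and the numbers in `main` are made up.

```java
public final class RemoteWindow {
    // True if [targetAddr, targetAddr + writeLen) lies inside the
    // registered region [regionAddr, regionAddr + regionLen).
    public static boolean fits(long regionAddr, int regionLen,
                               long targetAddr, int writeLen) {
        if (writeLen < 0 || writeLen > regionLen || targetAddr < regionAddr) {
            return false;
        }
        return targetAddr - regionAddr <= (long) regionLen - writeLen;
    }

    // Largest length that is still valid after shifting by offset bytes.
    public static int remainingLength(int regionLen, int offset) {
        if (offset < 0 || offset > regionLen) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        return regionLen - offset;
    }

    public static void main(String[] args) {
        long addr = 0x1000L; // made-up remote address
        int len = 64;        // made-up registered length

        // Same length at addr + 1 runs one byte past the region.
        System.out.println(fits(addr, len, addr + 1, len));
        // Shrinking the length by the offset keeps the write inside.
        System.out.println(fits(addr, len, addr + 1, remainingLength(len, 1)));
    }
}
```

In my code the equivalent change was to pass the reduced length to `sgeSend.setLength()` whenever the remote address is offset, instead of keeping the fixed value from `init()`.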
from disni.
No problem. As mentioned above, I would always check the completion queue results to see whether the command actually succeeded.
from disni.