Comments (8)
Hello. I have a question. The qp_num in ibv_wc is used to identify a QP uniquely. It seems the most useful use case is when we use a CQ for multiple QPs. I can't understand why this member is relevant for Receive Work Completions that are associated with an SRQ. I can't figure out the relationship between qp_num and SRQ.
Back to the question. I think the second approach would be more suitable because it's convenient for API users.
from rpma.
The qp_num in ibv_wc is used to identify a QP uniquely. It seems the most useful use case is when we use a CQ for multiple QPs. I can't understand why this member is relevant for Receive Work Completions that are associated with an SRQ. I can't figure out the relationship between qp_num and SRQ.
In general, a QP can have two CQs (for send completions and recv completions). These CQs can be a single CQ (as it is currently in librpma) or two separate CQs (#737). Each of these CQs might be connected to one or more connections at will.
SRQ is a feature that allows sharing RQ (part of QP) between QPs. Having a shared RQ makes sharing a recv CQ between connections potentially a very common use-case. I think this is why qp_num is especially relevant for SRQ use-cases. Nonetheless, I hope it is available all the time. :-)
I think the second approach would be more suitable because it's convenient for API users.
I do not like the second idea (allow getting directly struct rpma_conn related to a particular struct rpma_completion) for exactly the same reason I have voted against the internal SRQ table maintained by the librpma library (#737 (comment)). It has to be MT-safe and very efficient since processing completions is the performance-critical path. Whereas MT-safety does not go well with efficiency. As of now, I think the application is better equipped to address this issue in an optimal way.
from rpma.
SRQ is a feature that allows sharing RQ (part of QP) between QPs. Having a shared RQ makes sharing a recv CQ between connections potentially a very common use-case. I think this is why qp_num is especially relevant for SRQ use-cases. Nonetheless, I hope it is available all the time. :-)
Thanks for the explanation. Does this mean we first need to make sure that qp_num is available in the non-SRQ scenario?
I think the second approach would be more suitable because it's convenient for API users.
I do not like the second idea (allow getting directly struct rpma_conn related to a particular struct rpma_completion) for exactly the same reason I have voted against the internal SRQ table maintained by the librpma library (#737 (comment)). It has to be MT-safe and very efficient since processing completions is the performance-critical path. Whereas MT-safety does not go well with efficiency. As of now, I think the application is better equipped to address this issue in an optimal way.
But the first approach also needs a data structure, such as a hash table, to map qp_num to rpma_conn. Is this data structure also created and managed by user applications? If it is managed by rpma, we also need to make it MT-safe and efficient enough.
from rpma.
Does this mean we first need to make sure that qp_num is available in the non-SRQ scenario?
It is very important information for this discussion. :-)
the first approach also needs a data structure, such as a hash table, to map qp_num to rpma_conn
No, it does not. Imagine an application with only a few connections: if the number of connections is known upfront, finding a connection can be much simpler, e.g. with a table or a set of comparisons.
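To illustrate the "table" variant, here is a minimal sketch of what an application could do on its side. The names struct conn_entry and find_conn() are hypothetical and not part of librpma, and the void pointer merely stands in for whatever connection handle the application keeps. It assumes the number of connections is small and known upfront, so a linear scan is good enough.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side record: one entry per known connection. */
struct conn_entry {
    uint32_t qp_num; /* QP number of this connection */
    void *conn;      /* stand-in for the application's connection handle */
};

/*
 * Linear search over a small table. Returns the connection handle whose
 * entry matches qp_num, or NULL if no entry matches.
 */
static void *
find_conn(const struct conn_entry *table, size_t count, uint32_t qp_num)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].qp_num == qp_num)
            return table[i].conn;
    }
    return NULL;
}
```

If the table is filled once before completion processing starts and is only read afterwards, the lookup itself needs no locking. Whether that holds depends on how the application adds and removes connections.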
from rpma.
Does this mean we first need to make sure that qp_num is available in the non-SRQ scenario?
It is very important information for this discussion. :-)
After reading the implementation of ibv_cq, I find there are two APIs to poll a CQ. One is ibv_poll_cq and the other is ibv_start_poll. The latter will call mlx5_parse_cqe() with the last parameter set to 1:
mlx5_parse_cqe(cq, cqe64, cqe, &cq->cur_rsc, &cq->cur_srq, NULL, cqe_ver, 1);
If the last argument is 1, mlx5_parse_cqe will use lazy mode, which won't fill wc->qp_num.
qpn = be32toh(cqe64->sop_drop_qpn) & 0xffffff;
if (lazy) {
    cq->cqe64 = cqe64;
    cq->flags &= (~MLX5_CQ_LAZY_FLAGS);
} else {
    wc->wc_flags = 0;
    wc->qp_num = qpn; // qp num
}
The lazy mode is introduced in this patch. I haven't understood it completely, but it seems qp_num may not be reliable.
from rpma.
You are right that the MLX implementation of mlx5_parse_cqe() can set or not set qp_num depending on the lazy argument. But you have also noticed that lazy == 1 only for mlx5_parse_lazy_cqe(), which is used only by mlx5_start_poll() and mlx5_next_poll(), which are NOT part of the standard ibv_poll_cq() implementation.
The mlx5_start_poll() and mlx5_next_poll() functions are part of an MLX-specific interface, which we should not use at least until we decide to incorporate any MLX-specific optimizations. This is not the case right now.
Ref: https://manpages.debian.org/stretch/libibverbs-dev/ibv_create_cq_ex.3.en.html
TL;DR: Don't worry about it. qp_num is set for mlx5_poll_one() -> poll_cq() -> mlx5_poll_cq() -> ibv_poll_cq() (listed from callee up to caller).
from rpma.
Hi @janekmi
How about the following design:
/* get qp_num from rpma_conn object */
int rpma_conn_get_qp_num(const struct rpma_conn *conn, uint32_t *qp_num)
{
    if (conn == NULL || qp_num == NULL)
        return RPMA_E_INVAL;
    *qp_num = conn->id->qp->qp_num;
    return 0;
}
/* add qp_num in struct rpma_completion */
struct rpma_completion {
    ...
    uint32_t qp_num;
};
/* copy the qp_num from struct ibv_wc to struct rpma_completion */
int rpma_cq_get_completion(struct rpma_cq *cq, struct rpma_completion *cmpl)
{
    ...
    cmpl->qp_num = wc.qp_num;
}
Note: applications compare the two qp_num values themselves.
from rpma.
Resolved by #1087
from rpma.