Comments (2)
NCCL is strictly a collective communication library. It provides optimized routines for operations such as broadcast or all-reduce among multiple GPUs, but no mechanisms for task distribution or control. If you are just starting out with multi-GPU programming, CUDA-aware MPI might be a good place to start. Here is a presentation from last year's GTC that may be helpful.
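To make the distinction concrete, here is a minimal pure-Python sketch of what the two collectives named above compute. It is only a CPU simulation of the semantics, with each "rank" modeled as a plain list; it does not call NCCL or touch a GPU, and the function names `all_reduce` and `broadcast` are hypothetical helpers chosen for illustration.

```python
from functools import reduce
import operator


def all_reduce(buffers, op=operator.add):
    """Simulate all-reduce: every rank receives the elementwise reduction
    of all ranks' buffers (sum by default)."""
    reduced = [reduce(op, column) for column in zip(*buffers)]
    # Each rank gets its own copy of the reduced result.
    return [list(reduced) for _ in buffers]


def broadcast(buffers, root=0):
    """Simulate broadcast: every rank receives a copy of the root's buffer."""
    return [list(buffers[root]) for _ in buffers]


# Three simulated ranks, each holding a two-element buffer.
ranks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

summed = all_reduce(ranks)         # every rank now holds the elementwise sum
copied = broadcast(ranks, root=1)  # every rank now holds rank 1's data
```

Note that nothing here decides which rank does which work. That scheduling and control layer is what the collectives leave to the application or to a framework such as MPI.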
from nccl.
Thanks for the article and the explanation. And yes, I am exploring the world "beyond a single GPU" at the moment.