oneAPI Collective Communications Library (oneCCL)

Installation   |   Usage   |   Release Notes   |   Documentation   |   How to Contribute   |   License

oneAPI Collective Communications Library (oneCCL) provides an efficient implementation of communication patterns used in deep learning.

oneCCL is integrated into:

oneCCL is part of oneAPI.

Prerequisites

  • Ubuntu* 18
  • GNU*: C, C++ 4.8.5 or higher.

Refer to System Requirements for more details.

SYCL support

Intel(R) oneAPI DPC++/C++ Compiler with Level Zero v1.0 support.

To install Level Zero, refer to the instructions in the Intel(R) Graphics Compute Runtime repository or to the installation guide for oneAPI users.

BF16 support

  • AVX512F-based implementation requires GCC 4.9 or higher.
  • AVX512_BF16-based implementation requires GCC 10.0 or higher and GNU binutils 2.33 or higher.

Installation

General installation scenario:

cd oneccl
mkdir build
cd build
cmake ..
make -j install

If you need a clean build, create a new build directory and invoke cmake within it.

You can also do the following during installation:

Usage

Launching Example Application

Use the command:

$ source <install_dir>/env/setvars.sh
$ mpirun -n 2 <install_dir>/examples/benchmark/benchmark

Setting workers affinity

There are two ways to set the affinity of worker threads (workers): automatically and explicitly.

Automatic setup

  1. Set the CCL_WORKER_COUNT environment variable to the desired number of workers per process.
  2. Set the CCL_WORKER_AFFINITY environment variable to auto.

Example:

export CCL_WORKER_COUNT=4
export CCL_WORKER_AFFINITY=auto

With the variables above, oneCCL creates four workers per process, and the pinning depends on the process launcher.

If the application is launched with the mpirun provided by the oneCCL distribution package, the workers are automatically pinned to the last four cores available to the launched process. The exact IDs of the CPU cores can be controlled with mpirun parameters.

Otherwise, the workers are automatically pinned to the last four cores available on the node.
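
The "last N cores" rule can be illustrated with a small shell sketch. It is not oneCCL's implementation: the helper name last_cores is hypothetical, and it assumes the available cores are passed as a comma-separated list in ascending order. The actual selection happens inside oneCCL and depends on the launcher.

```shell
# Illustration only: pick the last N entries from a list of available cores.
# Assumptions: last_cores is a hypothetical helper, and the available cores
# are given as a comma-separated list in ascending order.
last_cores() {
    lc_count="$1"
    lc_available="$2"
    # Count the non-empty entries in the list.
    lc_total=$(printf '%s\n' "$lc_available" | tr ',' '\n' | grep -c . || true)
    if [ "$lc_count" -gt "$lc_total" ]; then
        echo "error: requested $lc_count workers but only $lc_total cores are available" >&2
        return 1
    fi
    # Keep the last N entries and join them back with commas.
    printf '%s\n' "$lc_available" | tr ',' '\n' | tail -n "$lc_count" | paste -s -d, -
}

# Four workers on a process that may run on cores 0 to 7.
last_cores 4 "0,1,2,3,4,5,6,7"   # prints 4,5,6,7
```

If a launcher restricted the process to cores 8 to 15, the same rule would give 12,13,14,15.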


Explicit setup

  1. Set the CCL_WORKER_COUNT environment variable to the desired number of workers per process.
  2. Set the CCL_WORKER_AFFINITY environment variable to the IDs of the cores to pin local workers to.

Example:

export CCL_WORKER_COUNT=4
export CCL_WORKER_AFFINITY=3,4,5,6

With the variables above, oneCCL creates four workers per process and pins them to cores 3, 4, 5, and 6, respectively.
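
Before launching, it can help to check that the two variables agree. The sketch below is not part of oneCCL: check_worker_affinity is a hypothetical helper. It assumes one plain numeric core ID per worker, as in the example above, and does not handle any other list syntax.

```shell
# Hypothetical pre-launch check, not part of oneCCL.
# Assumption: the affinity is either "auto" or a comma-separated list with
# exactly one numeric core ID per worker.
check_worker_affinity() {
    cwa_count="$1"
    cwa_affinity="$2"
    # "auto" leaves the choice of cores to oneCCL, so there is nothing to count.
    [ "$cwa_affinity" = "auto" ] && return 0
    # Count the entries that are plain numbers and the entries that are not.
    cwa_valid=$(printf '%s\n' "$cwa_affinity" | tr ',' '\n' | grep -c '^[0-9][0-9]*$' || true)
    cwa_invalid=$(printf '%s\n' "$cwa_affinity" | tr ',' '\n' | grep -vc '^[0-9][0-9]*$' || true)
    [ "$cwa_invalid" -eq 0 ] && [ "$cwa_valid" -eq "$cwa_count" ]
}

export CCL_WORKER_COUNT=4
export CCL_WORKER_AFFINITY=3,4,5,6

if check_worker_affinity "$CCL_WORKER_COUNT" "$CCL_WORKER_AFFINITY"; then
    echo "affinity ok"
else
    echo "affinity mismatch"
fi
```

With the values above the check passes. A list such as 3,4,5 with CCL_WORKER_COUNT=4 would be reported as a mismatch.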

Using oneCCL package from CMake

oneCCLConfig.cmake and oneCCLConfigVersion.cmake are included in the oneCCL distribution.

With these files, you can integrate oneCCL into a user project with the find_package command. A successful invocation of find_package(oneCCL <options>) creates the imported target oneCCL, which can be passed to the target_link_libraries command.

For example:

project(Foo)
add_executable(foo foo.cpp)

# Search for oneCCL
find_package(oneCCL REQUIRED)

# Connect oneCCL to foo
target_link_libraries(foo oneCCL)

oneCCLConfig files generation

To generate oneCCLConfig files for the oneCCL package, use the provided cmake/scripts/config_generation.cmake file:

cmake [-DOUTPUT_DIR=<output_dir>] -P cmake/scripts/config_generation.cmake

Additional Resources

Blog Posts

Workshop Materials

  • oneAPI, oneCCL and OFI: Path to Heterogeneous Architecture Programming with Scalable Collective Communications: recording and slides

Contribute

See CONTRIBUTING for more information.

License

Distributed under the Apache License 2.0. See LICENSE for more information.

Security

To report a vulnerability, refer to the Intel vulnerability reporting policy.

Contributors

adk9, dependabot[bot], ksenyako, mshiryaev, outoftardis, sazanovd, shirosankaku, ykiryano
