AMD MIGraphX

AMD MIGraphX is AMD's graph inference engine for accelerating machine learning model inference. To use MIGraphX, you can install the binaries or build from source. Refer to the following sections for Ubuntu installation instructions (instructions for other Linux distributions will be provided in the future).

Note

You must install ROCm before installing MIGraphX.

Installing from binaries

Install binaries using:

sudo apt update && sudo apt install -y migraphx

Header files and libraries are installed under /opt/rocm-<version>, where <version> is the ROCm version.

Building from source

You have three options for building from source:

  • ROCm build tool: Uses rbuild to install prerequisites, then you can build the libraries with a single command.

  • CMake: Uses a script to install prerequisites, then you can use CMake to build the source.

  • Docker: Builds a Docker image with all prerequisites installed, then you can build the MIGraphX sources inside a Docker container.

Build prerequisites

The following are prerequisites for building MIGraphX:

  • ROCm CMake modules
  • MIOpen for running on the GPU
  • rocBLAS for running on the GPU
  • HIP for running on the GPU
  • Protobuf for reading ONNX files
  • Half, an IEEE 754-based half-precision floating-point library
  • pybind11 for Python bindings
  • JSON for model serialization to JSON string format
  • MessagePack for model serialization to binary format
  • SQLite3 to create a database of kernels' tuning information or run queries on an existing database

Use the ROCm build tool rbuild

  1. Install rocm-cmake, pip3, rocblas, and miopen-hip:

    sudo apt install -y rocm-cmake python3-pip rocblas miopen-hip
  2. Install rbuild (sudo may be required):

    pip3 install https://github.com/RadeonOpenCompute/rbuild/archive/master.tar.gz
  3. Build MIGraphX source code:

    rbuild build -d depend -B build -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')

Once completed, all prerequisites are in the depend folder and MIGraphX is in the build directory.

Note

If you get an rbuild: command not found error, rbuild is installed in $HOME/.local/bin, which is not in PATH. Either add the folder to PATH with export PATH=$HOME/.local/bin:$PATH, or pass the option --prefix /usr/local to the pip3 command when installing rbuild.

Use CMake to build MIGraphX

  1. Install the prerequisites:

    rbuild prepare -d depend

    This puts all the prerequisites in the depend folder. They can be used in the CMake configuration as -DCMAKE_PREFIX_PATH=depend.

    If you have sudo access, as an alternative to the rbuild command, you can install the prerequisites in the same way as the Dockerfile does, by calling ./tools/install_prereqs.sh.

    By default, all prerequisites are installed at the default location (/usr/local) and are accessible to all users. Installing to the default location requires sudo to run the script. You can also specify a different location using ./tools/install_prereqs.sh $custom_location.

  2. Go to the project folder and create a build directory:

    mkdir build
    cd build
  3. Configure CMake. If the prerequisites are installed at the default location /usr/local, use:

    CXX=/opt/rocm/llvm/bin/clang++ cmake .. -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')

    Otherwise, you need to set -DCMAKE_PREFIX_PATH=$your_loc to configure CMake.

  4. Build MIGraphX source code:

    make -j$(nproc)

    You can verify this using:

    make -j$(nproc) check
  5. Install MIGraphX libraries:

    make install

Use Docker

The easiest way to set up the development environment is to use Docker.

  1. With the Dockerfile, build a Docker image:

        docker build -t migraphx .
  2. Enter the development environment using docker run:

        docker run --device='/dev/kfd' --device='/dev/dri' -v=`pwd`:/code/AMDMIGraphX -w /code/AMDMIGraphX --group-add video -it migraphx
  3. In the Docker container, all required prerequisites are already installed, so you can go to the folder /code/AMDMIGraphX and follow the steps (starting from step 2) in the Use CMake to build MIGraphX section.

Using the MIGraphX Python module

To use MIGraphX's Python module, you can set PYTHONPATH or use the .deb package:

  • Setting PYTHONPATH:

    export PYTHONPATH=/opt/rocm/lib:$PYTHONPATH
  • Creating the deb package:

    make package

    This prints the path to the generated .deb package.

    To install:

    dpkg -i <path_to_deb_file>
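Once the module is importable, a typical workflow is to parse an ONNX file, compile it for a target, and run it. The sketch below assumes a working MIGraphX and ROCm installation, a local model.onnx file, and an input parameter named data, all of which are placeholders; use model.get_parameter_shapes() to see your model's actual inputs:

```python
import numpy as np
import migraphx

# Sketch only: "model.onnx" and the "data" parameter name are assumptions.
model = migraphx.parse_onnx("model.onnx")
model.compile(migraphx.get_target("gpu"))  # or "ref" for the reference CPU target
print(model.get_parameter_shapes())        # inspect the expected inputs

inp = np.random.rand(1, 3, 224, 224).astype(np.float32)
results = model.run({"data": inp})         # list of output arguments
print(results[0])
```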

Calling MIGraphX APIs

To use MIGraphX's C/C++ API in your CMake project, you must set CMAKE_PREFIX_PATH to the MIGraphX installation location and run:

find_package(migraphx)
target_link_libraries(myApp migraphx::c)

Here, myApp is the CMake target in your project.

Building for development

Using rbuild, you can install the dependencies for development with:

rbuild develop -DGPU_TARGETS=$(/opt/rocm/bin/rocminfo | grep -o -m1 'gfx.*')

This installs development dependencies in the deps directory and configures CMake to use those dependencies in the build directory. You can change these directories by passing the --deps-dir and --build-dir flags to the rbuild command:

rbuild develop --build-dir build_rocm_55 --deps-dir /home/user/deps_dir

Building the documentation

HTML and PDF documentation can be built using either of the following commands:

cmake --build . --config Release --target doc

make doc

This builds a local searchable website inside the docs/html folder.

Documentation is built using Doxygen and rocm-docs-core.

Run the steps below to build documentation locally.

cd docs

pip3 install -r sphinx/requirements.txt

python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html

Depending on your setup, sudo may be required for the pip install.

Formatting the code

All the code is formatted using clang-format. To format a file, use:

clang-format-10 -style=file -i <path-to-source-file>

Also, githooks can be installed to format the code on each commit:

./.githooks/install

amdmigraphx's Issues

Tasks for July

This is a ToDo list to finish for July. Instead of creating an issue for each task, let's keep one issue per release or month and discuss the tasks there.
@adityaatluri

  • Add batch norm cpu implementation for both inference and training along with tests.
  • Add GPU kernel for batch norm for both inference and training along with tests.
  • Add AMD copyright and MIT license notices to all files

@pfultz2

@wsttiger

Add attribute for aliased output

To keep better track of usage, it would be nice to have an attribute that indicates when the output of an operator is an alias of an input argument.

For many operators in MIOpen, the output buffer gets passed in as a parameter. It would be good to know that the result argument is an alias of this parameter.
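As a sketch of what such an attribute could look like (hypothetical names in Python for illustration; this is not the actual MIGraphX API), each operator could expose which input index, if any, its output aliases:

```python
# Hypothetical sketch: each operator reports which of its inputs, if any,
# its output buffer aliases, so analysis passes can query one attribute
# instead of special-casing every operator.
class Operator:
    output_alias = None  # None means the output is a fresh buffer


class ReluInplace(Operator):
    output_alias = 0     # result is written into the first input's buffer


class Convolution(Operator):
    output_alias = None  # allocates a new output buffer


def writes_into_input(op):
    return op.output_alias is not None
```

A liveness or memory-coloring pass could then treat any operator with a non-None output_alias as extending the lifetime of that input, without knowing anything else about the operator.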

onnx model zoo resnet50 uses operators not supported in MIGraph

To reproduce:

  1. Download the resnet50 tarfile from https://github.com/onnx/models/tree/master/resnet50
  2. Run src/read_onnx on the model.onnx file from the package above

MIGraph prints the graph as it is read in:
./src/onnx/read_onnx resnet50/model.onnx | grep unknown
@284 = unknown:Sum(@281,@283) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@294 = unknown:Sum(@293,@285) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@304 = unknown:Sum(@303,@295) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@316 = unknown:Sum(@313,@315) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@326 = unknown:Sum(@325,@317) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@336 = unknown:Sum(@335,@327) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@346 = unknown:Sum(@345,@337) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@358 = unknown:Sum(@355,@357) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@368 = unknown:Sum(@367,@359) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@378 = unknown:Sum(@377,@369) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@388 = unknown:Sum(@387,@379) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@398 = unknown:Sum(@397,@389) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@408 = unknown:Sum(@407,@399) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@420 = unknown:Sum(@417,@419) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@430 = unknown:Sum(@429,@421) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@440 = unknown:Sum(@439,@431) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@448 = unknown:Softmax(@447) -> float_type, {1, 1000}, {1000, 1}

Improve coverage of unit tests for memory coloring

Here is the latest coverage reports for:

There are several areas where it seems important to have code coverage. Ideally we should have full coverage for memory_coloring_impl::allocate and memory_coloring_impl::build (although we can skip empty programs).

The check for invalid offsets in rewrite seems important to cover as well, unless we want to make that check an assert. We don't need coverage for the unify_literals check.

Also, it would be good to have coverage for the ordering, especially since the last else uses > instead of <. It's not at all clear why the operator changes, so a test demonstrating its usefulness would be good.

Release 0.1 definition

GOAL:

Initial demonstration for SC '18, available outside team

IS/IS NOT:

  • IS single GPU, IS NOT multiple GPU
  • IS inference, IS NOT training
  • IS ONNX file input IS NOT framework integration

Goals:

Dialect goal
Enumerated list of models (model zoo): resnet50, inception, mobilenet, mnist, yolov3

Performance goal
resnet50 - faster than TF
measuring performance on enumerated models
measure memory improvement - run with and without pass

Delivery
Timing soon enough for a great demo (from outside the team)
Repo is public and tagged

Task areas [can remove these as we have issues tagged]

Quality Assurance

  • Fix cpu verification of onnx files
    • Better reducing onnx file

Add batch norm support

  • Add a dummy operator for batch norm, computing just the shapes.
  • Add batch norm for cpu backend.
  • Test added batch norm
  • Add MIOpen batch norm support
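As a reference for the arithmetic involved, the list above amounts to implementing, per channel, y = gamma * (x - mean) / sqrt(var + eps) + beta. A NumPy sketch (the function name and NCHW layout are assumptions for illustration, not MIGraphX's API):

```python
import numpy as np

# NumPy sketch of batch norm inference over an NCHW input; the per-channel
# parameters gamma, beta, mean, var broadcast over N, H, W.
def batch_norm_inference(x, gamma, beta, mean, var, eps=1e-5):
    c = (1, -1, 1, 1)  # reshape per-channel vectors for broadcasting
    return (gamma.reshape(c) * (x - mean.reshape(c))
            / np.sqrt(var.reshape(c) + eps) + beta.reshape(c))
```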

Add Gemm

There are two ONNX operators corresponding to matrix multiplication: MatMul and Gemm. We have implemented MatMul, but we are not reading in Gemm, which results in an unknown operator. Add Gemm to the frontend ONNX parser. Also, Gemm needs a few more parameters: transA, transB, alpha, beta.
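For reference, ONNX Gemm computes Y = alpha * op(A) @ op(B) + beta * C, where op applies the optional transposes controlled by transA and transB. A NumPy sketch of these semantics (illustration only, not parser code):

```python
import numpy as np

# NumPy sketch of ONNX Gemm semantics: Y = alpha * op(A) @ op(B) + beta * C
def onnx_gemm(A, B, C=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    a = A.T if transA else A
    b = B.T if transB else B
    y = alpha * (a @ b)
    if C is not None:
        y = y + beta * C
    return y
```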

Test failure observed with ubuntu 18.04 + rocm 1.8 and MIGraph latest (2018-09-19)

Unit test error:
The following tests FAILED:
17 - test_gpu_miopen (Failed)
Errors while running CTest

Configuration information:

  1. Built in a docker container from the "Dockerfile" definition in MIGraph sources.
    env CXX=/opt/rocm/hcc/bin/hcc cmake ..
    make
    make check

    *** This is not rocm 1.9 (I mis-spoke in the meeting; I installed rocm 1.9 on my system, but this was run in the container, which uses rocm 1.8) ***

  2. /etc/issue
    Ubuntu 18.04.1 LTS \n \l

  3. uname -a
    Linux yarumal 4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

As a separate exercise I will try outside the docker container which does have rocm-dev 1.9.211 installed.

Add operand alias to operators

We need to find a way to annotate operand aliasing in operators/operations/instructions so that program analysis does not need to add specialized checks for each operator.

Add a context object to compute

For backends like MIOpen, we need to pass a MIOpen handle. Currently this is passed through the argument class. This requires shape to have an any_type, which is rather ambiguous.

Instead, the compute method could take a context object that stores the MIOpen handle. Since the context object is target-dependent, it can be set up by the target during compile.

We may need a mechanism to set the context from the target for programs that are already compiled. Perhaps an overload of eval that takes a target.
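A toy Python sketch of the proposed shape of this API (all names invented for illustration; the real code is C++): the target-dependent context carries the backend handle, and compute receives it directly instead of extracting it from an argument:

```python
# Toy sketch: a target-dependent context passed to compute, instead of
# smuggling a backend handle through the argument class.
class GpuContext:
    def __init__(self, handle):
        self.handle = handle  # would hold the MIOpen handle on a real GPU backend


class CpuContext:
    pass


class ConvOp:
    def compute(self, ctx, args):
        # Dispatch on the context the target set up during compile.
        if isinstance(ctx, GpuContext):
            return ("gpu_conv", ctx.handle, args)
        return ("cpu_conv", args)
```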

Failures reading ONNX files

This issue is partially to document what I see trying model zoo models and partially to sync on specific versions. Here is what I see trying each of the five designated models with code as of 10/1:

resnet50 - using https://github.com/onnx/models/tree/master/resnet50, release 1.3

Produces list of nodes w/o complaint

inception_v2 - using https://github.com/onnx/models/tree/master/inception_v2, release 1.3

@481 = @literal{ ... } -> float_type, {64}, {1}
@482 = @literal{ ... } -> float_type, {64}, {1}
@483 = @literal{ ... } -> float_type, {64}, {1}
@484 = @literal{ ... } -> float_type, {64}, {1}
@485 = @literal{ ... } -> float_type, {64, 3, 7, 7}, {147, 49, 7, 1}
data_0 = @param:data_0 -> float_type, {1, 3, 224, 224}, {150528, 50176, 224, 1}
@487 = convolution[padding={3, 3}, stride={2, 2}, dilation={1, 1}] -> float_type, {1, 64, 112, 112}, {802816, 12544, 112, 1}
@488 = batch_norm_inference(@487,@484,@483,@482,@481) -> float_type, {1, 64, 112, 112}, {802816, 12544, 112, 1}
@489 = unknown:Unsqueeze(@480) -> float_type, {64}, {1}

terminate called after throwing an instance of 'migraph::exception'
what(): /home/mev/source/MIGraph/src/include/migraph/check_shapes.hpp:66: Dimensions do not match
Aborted (core dumped)

mobilenet - using https://github.com/onnx/models/tree/master/models/image_classification/mobilenet

Passes without complaint, but does give one unknown operator:

mev@cafayate:~/source/MIGraph/build$ ./src/onnx/read_onnx /home/mev/dockerx/models/mobilenetv2-1.0/mobilenetv2-1.0.onnx | grep unknown
@420 = unknown:GlobalAveragePool(@419) -> float_type, {1, 1280, 7, 7}, {62720, 49, 7, 1}

mnist - using https://github.com/onnx/models/tree/master/mnist, onnx version 1.2

@0 = @literal{-0.044856, 0.00779166, 0.0681008, 0.0299937, -0.12641, 0.140219, -0.0552849, -0.0493838, 0.0843221, -0.0545404} -> float_type, {1, 10}, {10, 1}
@1 = @literal{256, 10} -> int64_type, {2}, {1}
@2 = @literal{ ... } -> float_type, {16, 4, 4, 10}, {160, 40, 10, 1}
@3 = @literal{1, 256} -> int64_type, {2}, {1}
@4 = @literal{ ... } -> float_type, {16, 1, 1}, {1, 1, 1}
@5 = @literal{ ... } -> float_type, {16, 8, 5, 5}, {200, 25, 5, 1}
@6 = @literal{-0.16154, -0.433836, 0.0916414, -0.0168522, -0.0650264, -0.131738, 0.0204176, -0.12111} -> float_type, {8, 1, 1}, {1, 1, 1}
@7 = @literal{ ... } -> float_type, {8, 1, 5, 5}, {25, 25, 5, 1}
Input3 = @param:Input3 -> float_type, {1, 1, 28, 28}, {784, 784, 28, 1}
@9 = convolution[padding={0, 0}, stride={1, 1}, dilation={1, 1}] -> float_type, {1, 8, 24, 24}, {4608, 576, 24, 1}

terminate called after throwing an instance of 'migraph::exception'
what(): /home/mev/source/MIGraph/src/include/migraph/check_shapes.hpp:66: Dimensions do not match
Aborted (core dumped)

yolov3 - couldn't find v3 in the model zoo, so tried https://github.com/onnx/models/tree/master/tiny_yolov2, version 1.2

Some unknown operators, but no crash:
mev@cafayate:~/source/MIGraph/build$ ./src/onnx/read_onnx /home/mev/dockerx/models/tiny_yolov2/model.onnx | grep unknown
[libprotobuf INFO google/protobuf/io/coded_stream.cc:610] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 63480670
@43 = unknown:ImageScaler(image) -> float_type, {0, 3, 416, 416}, {519168, 173056, 416, 1}
@46 = unknown:LeakyRelu(@45) -> float_type, {0, 16, 414, 414}, {2742336, 171396, 414, 1}
@50 = unknown:LeakyRelu(@49) -> float_type, {0, 32, 205, 205}, {1344800, 42025, 205, 1}
@54 = unknown:LeakyRelu(@53) -> float_type, {0, 64, 100, 100}, {640000, 10000, 100, 1}
@58 = unknown:LeakyRelu(@57) -> float_type, {0, 128, 48, 48}, {294912, 2304, 48, 1}
@62 = unknown:LeakyRelu(@61) -> float_type, {0, 256, 22, 22}, {123904, 484, 22, 1}
@66 = unknown:LeakyRelu(@65) -> float_type, {0, 512, 9, 9}, {41472, 81, 9, 1}
@70 = unknown:LeakyRelu(@69) -> float_type, {0, 1024, 6, 6}, {36864, 36, 6, 1}
@73 = unknown:LeakyRelu(@72) -> float_type, {0, 1024, 4, 4}, {16384, 16, 4, 1}

Add scheduling pass

When instruction ordering is fixed, memory coloring can only do a limited job of reducing the memory footprint. Add a pass to reorder instructions, which also has the potential to interleave computation and memory copies to improve throughput.
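A toy simulation (invented example, not MIGraphX code) of why reordering matters: with two independent chains whose intermediates are large, running each chain to completion frees its intermediate before the next chain starts, while interleaving the chains keeps both large buffers alive at once:

```python
# Toy model: each instruction produces a buffer of a given size; a buffer is
# freed after its last reader runs. Peak live memory depends on the order.
def peak_memory(order, size, uses):
    pos = {name: i for i, name in enumerate(order)}
    last = {}  # producer -> index of its last reader in this order
    for name in order:
        for u in uses[name]:
            last[u] = max(last.get(u, -1), pos[name])
    live = peak = 0
    for i, name in enumerate(order):
        live += size[name]
        peak = max(peak, live)
        for prod, lp in last.items():
            if lp == i:
                live -= size[prod]  # last reader just ran; free the buffer
    return peak

# Two independent chains a -> a2 and b -> b2 with 1000-byte intermediates.
size = {"a": 1000, "a2": 10, "b": 1000, "b2": 10}
uses = {"a": [], "b": [], "a2": ["a"], "b2": ["b"]}
chain_first = peak_memory(["a", "a2", "b", "b2"], size, uses)  # 1020
interleaved = peak_memory(["a", "b", "a2", "b2"], size, uses)  # 2010
```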
