spcl / daceml
A Data-Centric Compiler for Machine Learning
Home Page: https://daceml.readthedocs.io
License: BSD 3-Clause "New" or "Revised" License
Try to remove OneHot from the BERT full model test
When loading ONNX files, we should check that the opset version matches the one we support.
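A minimal sketch of such a check, assuming a single supported opset for the default ONNX domain (SUPPORTED_OPSET and the error type are illustrative, not the actual daceml importer API):

import onnx

SUPPORTED_OPSET = 12  # hypothetical: the opset version the importer targets

def check_opset(model: onnx.ModelProto) -> None:
    # model.opset_import lists (domain, version) pairs; the default
    # ONNX domain is the empty string (sometimes "ai.onnx").
    for opset in model.opset_import:
        if opset.domain in ("", "ai.onnx") and opset.version != SUPPORTED_OPSET:
            raise ValueError(f"ONNX opset {opset.version} is not supported; "
                             f"expected opset {SUPPORTED_OPSET}")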
This is needed in order to save parameters for individual modules and to avoid PyTorch's limit of 64 allowed parameters per module call.
Alternatively: add a mode that runs with ctypes/pybind11 when too many parameters are present.
and other transformers
I am getting the following errors when I run examples/plot_fpga_lenet.py using the current master branch:
$ python examples/plot_fpga_lenet.py
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONRelu_11" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONMaxPool_12" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "input" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONRelu_14" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONMaxPool_15" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONReshape_16" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONGemm_18" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONRelu_19" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONGemm_20" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONRelu_21" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "onnxCOLONCOLONGemm_22" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/validation.py:321: UserWarning: WARNING: Use of uninitialized transient "x" in state TestLeNet
warnings.warn('WARNING: Use of uninitialized transient "%s" in state %s' %
Traceback (most recent call last):
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/compiler.py", line 222, in configure_and_compile
_run_liveoutput("cmake --build . --config %s" % (Config.get('compiler', 'build_type')),
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/compiler.py", line 405, in _run_liveoutput
raise subprocess.CalledProcessError(process.returncode, command, output.getvalue())
subprocess.CalledProcessError: Command 'cmake --build . --config RelWithDebInfo' returned non-zero exit status 2.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/scripts/plot_fpga_lenet.py", line 76, in <module>
daceml_result = daceml_module(x)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/daceml/torch/module.py", line 385, in forward
self.function = self._initialize_sdfg(actual_inputs)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/daceml/torch/module.py", line 355, in _initialize_sdfg
self.compiled_function = function_generator(self, dummy_inputs)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/daceml/torch/dispatchers/cpp_torch_extension.py", line 498, in register_and_compile_torch_extension
compiled, handle_ptr = compile_and_init_sdfgs(module, dummy_inputs)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/daceml/torch/dispatchers/common.py", line 47, in compile_and_init_sdfgs
compiled: CompiledSDFG = module.dace_model.compile_and_init()
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/daceml/onnx/onnx_importer.py", line 452, in compile_and_init
compiled_sdfg = self.sdfg.compile()
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/sdfg/sdfg.py", line 2140, in compile
shared_library = compiler.configure_and_compile(program_folder, sdfg.name)
File "/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/compiler.py", line 231, in configure_and_compile
raise cgx.CompilationError('Compiler failure:\n' + ex.output)
dace.codegen.exceptions.CompilationError: Compiler failure:
Consolidate compiler generated dependencies of target TestLeNet_1
[ 16%] Building CXX object CMakeFiles/TestLeNet_1.dir/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/cpu/TestLeNet_1.cpp.o
In file included from /u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/dace.h:14,
from /afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/cpu/TestLeNet_1.cpp:2:
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/types.h: In constructor ‘dace::half::half(float)’:
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/types.h:94:28: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
94 | uint32_t x = *((uint32_t*)&f);
| ~^~~~~~~~~~~~~~
[ 33%] Building CXX object CMakeFiles/TestLeNet_1.dir/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp.o
In file included from /u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/copy.h:5,
from /u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/xilinx/device.h:8,
from /u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/fpga_device.h:5,
from /afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:1:
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/types.h: In constructor ‘dace::half::half(float)’:
/u1/ruckman/anaconda3/envs/dace-ml-dev/lib/python3.9/site-packages/dace/codegen/../runtime/include/dace/types.h:94:28: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
94 | uint32_t x = *((uint32_t*)&f);
| ~^~~~~~~~~~~~~~
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp: In function ‘void TestLeNet_0_0_0(const float*, const float*, const float*, const float*, const float*, const float*, const float*, const float*, const float*, const float*, const float*, const float*, const long long int*, float*, float*, float*, float*, float*, float*, float*, float*, float*, float*, float*, float*, float*)’:
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:575:243: error: ‘y_out’ was not declared in this scope
575 | if (((((b > 0) || (n0 > 0)) && (k_drain < p) && (m_drain < 576)) || ((k == (25 - 1)) && (m >= 0)) || (__bn0km_drain && (k_drain < p)))) {Y_pipe[p].push((((p == 0) || ((k_drain == (25 - 1)) && (! __bn0km_drain))) ? y_out[0] : forward_in.pop()));
| ^~~~~
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:843:263: error: ‘y_out’ was not declared in this scope
843 | if (((((b > 0) || (n0 > 0)) && (k_drain < p) && (m_drain < 64)) || ((k == (150 - 1)) && (m >= 0)) || (__bn0km_drain && (k_drain < p)))) {fpga_im2col_conv_1_Y_pipe[p].push((((p == 0) || ((k_drain == (150 - 1)) && (! __bn0km_drain))) ? y_out[0] : forward_in.pop()));
| ^~~~~
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:1086:248: error: ‘c_out’ was not declared in this scope; did you mean ‘b_out’?
1086 | if (((((n0 > 0) || (tm > 0)) && (k_drain < p) && (m_drain < 120)) || ((k == (256 - 1)) && (m >= 0)) || (__n0tmkm_drain && (k_drain < p)))) {C_pipe[p].push((((p == 0) || ((k_drain == (256 - 1)) && (! __n0tmkm_drain))) ? c_out[0] : forward_in.pop()));
| ^~~~~
| b_out
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:1199:259: error: ‘c_out’ was not declared in this scope; did you mean ‘b_out’?
1199 | if (((((n0 > 0) || (tm > 0)) && (k_drain < p) && (m_drain < 84)) || ((k == (120 - 1)) && (m >= 0)) || (__n0tmkm_drain && (k_drain < p)))) {fpga_gemm_1_C_pipe[p].push((((p == 0) || ((k_drain == (120 - 1)) && (! __n0tmkm_drain))) ? c_out[0] : forward_in.pop()));
| ^~~~~
| b_out
/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp:1312:257: error: ‘c_out’ was not declared in this scope; did you mean ‘b_out’?
1312 | if (((((n0 > 0) || (tm > 0)) && (k_drain < p) && (m_drain < 10)) || ((k == (84 - 1)) && (m >= 0)) || (__n0tmkm_drain && (k_drain < p)))) {fpga_gemm_2_C_pipe[p].push((((p == 0) || ((k_drain == (84 - 1)) && (! __n0tmkm_drain))) ? c_out[0] : forward_in.pop()));
| ^~~~~
| b_out
gmake[2]: *** [CMakeFiles/TestLeNet_1.dir/build.make:90: CMakeFiles/TestLeNet_1.dir/afs/slac.stanford.edu/u/re/ruckman/projects/daceml-dev/software/.dacecache/TestLeNet_1/src/xilinx/device/TestLeNet_0_0.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:630: CMakeFiles/TestLeNet_1.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
Hello, I can't download the dataset from http://spclstorage.inf.ethz.ch/; this is the same issue as with ncc.
Since Im2col has an ONNX node within a map, the schedule of the ONNX node is inferred to be Sequential (since the schedule of the outer map is Sequential).
The ONNX expansion then fails because there is no Sequential implementation of Gemm.
Goal: get the BERT encoder to run with MKL and cuBLAS
Here is my problem when running examples/plot_cuda_mish.py.
Why does this problem occur?
Could you please help me fix it?
File "/home/daceml/daceml/venv/lib/python3.8/site-packages/dace/codegen/compiler.py", line 227, in configure_and_compile
_run_liveoutput("cmake --build . --config %s" % (Config.get('compiler', 'build_type')),
File "/home/daceml/daceml/venv/lib/python3.8/site-packages/dace/codegen/compiler.py", line 410, in _run_liveoutput
raise subprocess.CalledProcessError(process.returncode, command, output.getvalue())
subprocess.CalledProcessError: Command 'cmake --build . --config RelWithDebInfo' returned non-zero exit status 2.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "plot_cuda_mish.py", line 192, in <module>
dace_output = dace_mish(dace_input)
File "/home/daceml/daceml/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/daceml/daceml/daceml/torch/module.py", line 386, in forward
self.function = self._initialize_sdfg(actual_inputs)
File "/home/daceml/daceml/daceml/torch/module.py", line 354, in _initialize_sdfg
self.compiled_function = function_generator(self, dummy_inputs)
File "/home/daceml/daceml/daceml/torch/dispatchers/cpp_torch_extension.py", line 481, in register_and_compile_torch_extension
compiled, handle_ptr, compiled_bwd, bwd_handle_ptr = compile_and_init_sdfgs(
File "/home/daceml/daceml/daceml/torch/dispatchers/common.py", line 80, in compile_and_init_sdfgs
compiled_bwd: CompiledSDFG = module.backward_sdfg.compile()
File "/home/daceml/daceml/venv/lib/python3.8/site-packages/dace/sdfg/sdfg.py", line 2141, in compile
shared_library = compiler.configure_and_compile(program_folder, sdfg.name)
File "/home/daceml/daceml/venv/lib/python3.8/site-packages/dace/codegen/compiler.py", line 236, in configure_and_compile
raise cgx.CompilationError('Compiler failure:\n' + ex.output)
dace.codegen.exceptions.CompilationError: Compiler failure:
[ 20%] Building NVCC (Device) object CMakeFiles/cuda_compile_1.dir/__/__/__/__/__/__/examples/.dacecache/DaCeMish_backward/src/cuda/cuda_compile_1_generated_DaCeMish_backward_0_cuda.cu.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/home/daceml/daceml/venv/lib/python3.8/site-packages/dace/codegen/../runtime/include/dace/math.h(504): error: no operator "*=" matches these operands
operand types are: dace::vec<float, 4U> *= const dace::vec<float, 4U>
detected during instantiation of "T dace::math::ipow(const T &, const unsigned int &) [with T=dace::vec<float, 4U>]"
/home/daceml/daceml/examples/.dacecache/DaCeMish_backward/src/cuda/DaCeMish_backward_0_cuda.cu(90): here
1 error detected in the compilation of "/home/daceml/daceml/examples/.dacecache/DaCeMish_backward/src/cuda/DaCeMish_backward_0_cuda.cu".
CMake Error at cuda_compile_1_generated_DaCeMish_backward_0_cuda.cu.o.RelWithDebInfo.cmake:276 (message):
Error generating file
/home/daceml/daceml/examples/.dacecache/DaCeMish_backward/build/CMakeFiles/cuda_compile_1.dir/__/__/__/__/__/__/examples/.dacecache/DaCeMish_backward/src/cuda/./cuda_compile_1_generated_DaCeMish_backward_0_cuda.cu.o
gmake[2]: *** [CMakeFiles/DaCeMish_backward_0.dir/build.make:568: CMakeFiles/cuda_compile_1.dir/__/__/__/__/__/__/examples/.dacecache/DaCeMish_backward/src/cuda/cuda_compile_1_generated_DaCeMish_backward_0_cuda.cu.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:84: CMakeFiles/DaCeMish_backward_0.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
Something like
C[:] = ONNXAdd(A, B)
instead of
ONNXAdd(A=A, B=B, C=C)
This would require renaming the current ONNX nodes to something else, maybe ONNXAddNode, so that we can still keep the old behaviour as a fallback.
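A rough sketch of the proposed functional form, wrapping the renamed node (ONNXAddNode and the shape-inference shortcut are hypothetical):

import numpy as np

def ONNXAdd(A, B):
    # Allocate the output, then forward to the node-based API, which this
    # proposal would rename to ONNXAddNode (hypothetical name).
    C = np.empty(np.broadcast_shapes(A.shape, B.shape), dtype=A.dtype)
    ONNXAddNode(A=A, B=B, C=C)
    return C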
Pow returns NaN in the PyTorch BERT encoder test; check how ORT implements it.
A memlet validation error occurs, which means that the pure backward implementation needs to be adapted.
The code seems to recompile on every run.
I am trying to install DaceML on my machine with the following specifications:
OS: Ubuntu 20.04
Arch: x86
Cuda: 12.2
Python: 3.9
pytorch: 1.8
dace: v0.13.3
The steps I followed:
-> Installed the relevant versions of CUDA, Python, dace, and PyTorch
-> Cloned the patched version of ONNX Runtime from: https://github.com/orausch/onnxruntime.git
-> Built it using the following command (modified for CUDA 12.2):
"./build.sh --use_cuda --cuda_version=12.2 --cuda_home=/usr/local/cuda-12.2 --cudnn_home=/usr/local/cuda-12.2 --build_shared_lib --parallel --config Release"
The build fails with this. Please let me know if you need any further information.
Please let us know if there is any incompatibility with any of these prerequisite software/libraries.
Broken in full_bert because of a GPU scalar cast spcl/dace#360
These expressions could become reduce nodes.
Add documentation about debug level and default_implementation
Likely due to bad support for subsets.
Related to #28.
Instead of cluttering the .dacecache, it would be better to compile the bridge upon installation or first use.
I'm trying to externally add and register a custom implementation for an ONNX op. For the sake of context, the op in question is ONNXMul. I've tried following the code snippet in the documentation, and I've come up with this:
import typing
from dace import SDFG, SDFGState
from dace.sdfg.nodes import Node
# NOTE: import paths follow the daceml docs and may vary across versions
from daceml.onnx import ONNXForward, ONNXOp, op_implementation
import daceml.onnx

@op_implementation(op="Mul", name="myimpl")
class FPGAMul(ONNXForward):
    @staticmethod
    def forward_can_be_applied(node: ONNXOp, state: SDFGState, sdfg: SDFG) -> bool:
        ...

    @staticmethod
    def forward(node: ONNXOp, state: SDFGState, sdfg: SDFG) -> typing.Union[Node, SDFG]:
        ...

daceml.onnx.default_implementation = 'myimpl'
This piece of code resides in the same Python script that loads an ONNX model and expands the library nodes. Namely, somewhere later in the script I have:

import onnx

model = onnx.load(model_path)
dace_model = daceml.onnx.ONNXModel(name, model)
print('ONNX model loaded...')
dace_model.sdfg.expand_library_nodes()
However, this results in an error:
Traceback (most recent call last):
File "load.py", line 114, in predict_daceml
dace_model.sdfg.expand_library_nodes()
File ".../dace/sdfg/sdfg.py", line 2559, in expand_library_nodes
impl_name = node.expand(self, state)
File ".../dace/sdfg/nodes.py", line 1269, in expand
raise KeyError("Unknown implementation for node {}: {}".format(type(self).__name__, implementation))
KeyError: 'Unknown implementation for node ONNXMul: myimpl'
Can you tell me why my implementation is not registering, and how to fix this?
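One hypothetical workaround, assuming the @op_implementation registration itself succeeded: pin the implementation on each ONNXMul node directly before expanding, instead of relying on daceml.onnx.default_implementation (the sketch uses dace's generic library-node interface):

# Hypothetical workaround: set the implementation per node before expansion.
for node, _parent in dace_model.sdfg.all_nodes_recursive():
    if type(node).__name__ == "ONNXMul":
        node.implementation = "myimpl"
dace_model.sdfg.expand_library_nodes()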
Which version of dace do I need to compile this with? With the latest version of dace (0.14.4) I get this error:
ModuleNotFoundError: No module named 'dace.codegen.targets.common'
Check test_nested_gradient_summation
Both the ORT expansion and the pure expansions ignore the input/output subsets and always operate on the full array specified by the memlet.
For the pure expansions, fixing this should amount to fixing in_desc_with_name, out_desc_with_name, and input_prog_for_node.
For onnxruntime expansions, we will need to copy before passing in the tensor if the tensor is not contiguous (see the sketch below). The reshape and CF transformations will also need to be checked.
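A small sketch of the contiguity guard for the onnxruntime path, assuming NumPy-backed host tensors (the helper name is illustrative):

import numpy as np

def ensure_contiguous(arr: np.ndarray) -> np.ndarray:
    # ORT kernels expect dense row-major buffers; copy only when necessary.
    if arr.flags["C_CONTIGUOUS"]:
        return arr
    return np.ascontiguousarray(arr)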
The ONNX_ prefix is added twice, so arrays have names like ONNX_ONNX_...
Also stop running duplicate tests on pauli
Otherwise, parallelism opportunities are missed as blockDim.x is usually small on its own.
At the moment, creating ONNX library node implementations entails a double nested loop over all expansions, which is hard to read and debug. If we could avoid dynamic class creation and the double for loop, or at least clean it up, it would be more approachable.
For better autocomplete