
onnxruntime-silicon's Introduction

ONNX Runtime for Apple Silicon

ONNX Runtime prebuilt wheels for Apple Silicon (M1 / M2 / arm64)

The official ONNX Runtime now ships arm64 binaries for macOS as well, but they only support the CPU backend. Starting with version v1.13.0, this package adds the CoreML backend.

Install

To install the prebuilt package, run the following command. The package is called onnxruntime-silicon but is a drop-in replacement for the onnxruntime package.

pip install onnxruntime-silicon
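
Once installed, you can check that the CoreML execution provider is actually available. A minimal sketch (the model path is a placeholder):

import onnxruntime

# CoreMLExecutionProvider should be listed alongside CPUExecutionProvider
print(onnxruntime.get_available_providers())

# Create a session that prefers CoreML and falls back to the CPU
session = onnxruntime.InferenceSession(
    "model.onnx",  # placeholder: path to your ONNX model
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)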

Build

To build the libraries yourself, please first install the following dependencies and run the build script.

brew install wget cmake protobuf git git-lfs
./build-macos.sh

The built wheel packages should be in the dist directory.
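
The exact wheel filename depends on the version and the Python ABI, so a glob is the simplest way to install the result:

pip install dist/onnxruntime_silicon-*.whl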

FAQ

Installation

pip install onnxruntime-silicon returns the following error: Could not find a version that satisfies the requirement onnxruntime-silicon

This indicates either that your Python version is not supported (currently only 3.8, 3.9, 3.10, and 3.11 are) or that your Python installation is not built for arm64. You can check the latter by running the following command:

file $(which python) | grep -q arm64 && echo "Python for arm64 found" || echo "Python for arm64 has not been found"
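
Alternatively, the same check can be done from within Python itself:

import platform

# Prints 'arm64' on an Apple Silicon build of Python, 'x86_64' on an Intel build
print(platform.machine())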

Import ONNX Runtime

import onnxruntime raises the exception: ModuleNotFoundError: No module named 'onnxruntime'

This means onnxruntime-silicon has not been installed yet; run python -m pip install onnxruntime-silicon. Check that it has been installed correctly with the following command:

python -m pip freeze | grep -q onnxruntime-silicon && echo "ONNX runtime for arm64 found" || echo "No ONNX runtime for arm64 found"

Import ONNX Runtime Silicon

import onnxruntime-silicon raises the exception: ModuleNotFoundError: No module named 'onnxruntime-silicon'

onnxruntime-silicon is a drop-in replacement for onnxruntime. After installing the package, everything works the same as with the original onnxruntime. Import the package like this: import onnxruntime.
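
In other words, the pip package name and the Python module name differ. A quick sanity check:

# The module is still named "onnxruntime", even though pip installs "onnxruntime-silicon"
import onnxruntime
print(onnxruntime.__version__)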

Another Issue

If your specific issue is not answered by the FAQ and no existing issue (open or closed) covers it, please open a new issue and provide the following information:

  • macOS version and architecture
  • Python version and architecture
  • Pip version
  • ONNX Runtime version

You can also run the following command and copy-paste its output into the issue:

echo ""; \
echo "Operating System: $(uname -s) $(uname -r)"; \
echo "Architecture: $(uname -m)"; \
echo "Python Version: $(python --version 2>&1)"; \
echo "Python Architecture: $(python -c 'import platform; print(platform.architecture()[0])')"; \
echo "Python Executable: $(file $(which python))"; \
echo "PIP Version: $(pip --version | awk '{print $2}')"; \
echo ""

About

MIT License - Copyright (c) 2024 Florian Bruggisser

onnxruntime-silicon's People

Contributors

cansik, laclouis5

onnxruntime-silicon's Issues

Build error with ORT version 1.15.0

I'm trying to compile onnxruntime-silicon 1.15.0 on my MacBook Pro (M1 Pro) running macOS Ventura, and the process fails with a compilation error. I don't have much information except that it may be caused by libprotobuf. I installed the build dependencies with brew beforehand, as mentioned in the repo, so libprotobuf should be installed.

ModuleNotFoundError: No module named 'onnxruntime-silicon'

Hi,
I just installed onnxruntime-silicon (it appears in my pip list) but I'm struggling to import it. If I just type 'import onnxruntime-silicon' I get a 'SyntaxError: invalid syntax', probably because of the '-'. I tried using importlib to import the module (onnxsilicon = importlib.import_module('onnxruntime-silicon')), but I still get an error (ModuleNotFoundError: No module named 'onnxruntime-silicon').

Could you please help me to import the module? Thanks!

No CoreML backend???

I am using a face model:
https://github.com/iperov/DeepFaceLive/releases/download/ZAHAR_LUPIN/Zahar_Lupin.dfm

environment:
onnxruntime-coreml == 1.13.1
onnxruntime-silicon == 1.13.1

device: Apple Silicon M1

python:
import onnx
import onnxruntime as rt

options = rt.SessionOptions()
options.log_severity_level = 0
options.intra_op_num_threads = 4
options.execution_mode = rt.ExecutionMode.ORT_SEQUENTIAL
options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL

onnxSession = rt.InferenceSession(onnx_model_path, providers=[{"CoreMLExecutionProvider"}], options)

logcat:
EP Error using [{'CoreMLExecutionProvider'}]
Falling back to ['CPUExecutionProvider'] and retrying.
2023-07-25 11:11:48.215511 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-07-25 11:11:48.215528 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2023-07-25 11:11:48.215535 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
EP Error using [{'CoreMLExecutionProvider'}]
Falling back to ['CPUExecutionProvider'] and retrying.
2023-07-25 11:11:48.215793 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-07-25 11:11:48.215804 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2023-07-25 11:11:48.215810 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
EP Error using [{'CoreMLExecutionProvider'}]
Falling back to ['CPUExecutionProvider'] and retrying.
2023-07-25 11:11:48.222091 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-07-25 11:11:48.222108 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2023-07-25 11:11:48.222116 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
2023-07-25 11:11:48.290081 [I:onnxruntime:, inference_session.cc:1222 Initialize] Initializing session.
2023-07-25 11:11:48.293215 [I:onnxruntime:, reshape_fusion.cc:42 ApplyImpl] Total fused reshape node count: 0
2023-07-25 11:11:48.295481 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295512 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295524 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295533 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295540 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295547 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295555 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295562 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295569 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295577 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295584 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295591 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295599 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295606 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295613 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295620 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295626 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.295634 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
2023-07-25 11:11:48.296167 [V:onnxruntime:, session_state.cc:1010 VerifyEachNodeIsAssignedToAnEp] Node placements
2023-07-25 11:11:48.296189 [V:onnxruntime:, session_state.cc:1013 VerifyEachNodeIsAssignedToAnEp] All nodes placed on [CPUExecutionProvider]. Number of nodes: 69
2023-07-25 11:11:48.296278 [V:onnxruntime:, session_state.cc:66 CreateGraphInfo] SaveMLValueNameIndexMapping
2023-07-25 11:11:48.296307 [V:onnxruntime:, session_state.cc:112 CreateGraphInfo] Done saving OrtValue mappings.
2023-07-25 11:11:48.296561 [I:onnxruntime:, session_state_utils.cc:199 SaveInitializedTensors] Saving initialized tensors.
2023-07-25 11:11:48.297107 [I:onnxruntime:, session_state_utils.cc:342 SaveInitializedTensors] Done saving initialized tensors

###############
onnxruntime-silicon has been installed, but it only supports the CPU backend, not the CoreML backend. I don't know the reason; can you help me?

Thanks in advance!
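
For reference, the providers argument expects a list of provider-name strings; the set literal {'CoreMLExecutionProvider'} in the snippet above is what triggers the EP error and the CPU fallback, and session options go through the sess_options keyword. A corrected sketch (the model path is a placeholder):

import onnxruntime as rt

options = rt.SessionOptions()

# Provider names are plain strings; CPU is listed as an explicit fallback
onnxSession = rt.InferenceSession(
    "model.onnx",  # placeholder: path to the exported ONNX model
    sess_options=options,
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)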

ANE Neural engine now working

I'm not yet sure what I did: I messed around with the onnxruntime source to change the CPU-only provider flag, casually, into requiring the ANE, and eventually got it to build.

Running some face restoration ONNX model, I was checking "powermetrics", disappointed to see the GPU at 0 with my M2 in low power mode. Then I plugged it into power, and suddenly ANE power usage spiked, maintaining itself around 10%.
I checked with the M1 Max, and it was the same: ANE in use, topping 20-25% sometimes.
I had never seen the ANE being used before, ceteris paribus besides two things:
either my swapping of every "bicubic" (non-CoreML-supported) operation into "bilinear", or the runtime I've been trying to figure out.
[screenshot: powermetrics output, 2023-12-15 5:06 PM]

On this matter, I came to various conclusions, or hypotheses, for one of which I chose the words carefully:

  • We should consider that we have only one device (e.g. an M1 Pro) with three computing units (CPU, GPU, ANE).
  • CoreML is one way to communicate with the device; there are alternatives (Metal, OpenCL, Vulkan, WebGL, ARM providers).
  • In this particular context of ours, onnxruntime, we are at the same level as three years ago.

I think there has been some confusion since the moment it was decided to limit to the CPU. I am not saying it's a feature and not a bug, because it was clearly stated that this decision was made to at least offer something that works, and it does work using only the CPU as the computing unit, a very well-documented CoreML processing-unit configuration option.
For converting models using onnx, all computing units are in the code.

So, am I the only one? Could it be an update in something else?
This Neural Engine usage, was it something that happened before and I only just noticed?
When running "sudo powermetrics" in a terminal, is there an incontestable relation between model inference with the CoreML execution provider and ANE usage?
In Activity Monitor, do you also see nothing significant for the ANE, besides many bytes written?
I used sudo powermetrics --hide-cpu-duty-cycle -i 2000

Also, I compiled CoreML for Intel x86_64 Macs and it runs flawlessly on my Mac mini 2018, giving a ten-fold increase in frames per second (still slow, though).

Please release 1.17.0

Hey my friend,

please release onnxruntime-silicon==1.17.0

I guess it's also the perfect opportunity to thank you for all the effort in the past months.

Building from source fails on M2 Pro

Python 3.9.16
I tried to build from source by running ./build-macos.sh

[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/util/type_resolver_util.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wire_format.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wrappers.pb.cc.o
[ 23%] Linking CXX static library libprotobuf.a
[ 23%] Built target libprotobuf
make: *** [all] Error 2
Traceback (most recent call last):
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2812, in <module>
    sys.exit(main())
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2727, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 1349, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 740, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/opt/homebrew/Cellar/[email protected]/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/homebrew/bin/cmake', '--build', '/Users/X/git/onnxruntime-silicon/onnxruntime/build/MacOS/Release', '--config', 'Release', '--', '-j8']' returned non-zero exit status 2.

performance drop on macos 14

Thanks for the great work. I've been using this since ort-1.13, on a MBP with an M1 Pro chip.

The problem is, after I updated my system from macOS 13 to 14, all the models using the CoreML EP became slower than before the update (still faster than the CPU EP, though). Performance dropped by ~50-75% on average. I didn't make a Time Machine backup before updating, so downgrading the system back to 13 isn't a good option.

I managed to build a wheel manually instead of pip install onnxruntime-silicon, but the performance remains the same.

Will you support macOS 14 soon?
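
To quantify a regression like this, a simple per-provider timing comparison can help. A rough sketch (the model path is a placeholder; dynamic input dimensions are filled with 1):

import time
import numpy as np
import onnxruntime as rt

def bench(providers, runs=50):
    sess = rt.InferenceSession("model.onnx", providers=providers)  # placeholder model
    inp = sess.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.rand(*shape).astype(np.float32)
    sess.run(None, {inp.name: x})  # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {inp.name: x})
    return (time.perf_counter() - start) / runs

print("CoreML EP:", bench(["CoreMLExecutionProvider", "CPUExecutionProvider"]))
print("CPU EP:   ", bench(["CPUExecutionProvider"]))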

Does it work with python?

Awesome project, thanks!
Did you actually get it working? Officially, the ONNX CoreML provider does not support the Python API.
Indeed, after rebuilding, ort.get_available_providers() only returns the CPU one.


Thanks in advance!

[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Failed to find kernel for Split(18) (node Split). Kernel not found

I'm trying to run inference with Depth Anything ONNX models on an M1 Mac in Python with onnxruntime-silicon. The models are converted from pth to ONNX.

I can run the inference on CPU but I got the following error while running on GPU:

onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Failed to find kernel for Split(18) (node Split). Kernel not found

Here is my python code:

import cv2
import numpy as np
import onnxruntime
from depth_anything.util.transform import load_image
import os

def postprocess(depth, original_size):
    # Resize, apply a color map and save the depth map
    depth = cv2.resize(depth[0, 0], original_size)
    depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
    depth = depth.astype(np.uint8)
    depth_color = cv2.applyColorMap(depth, cv2.COLORMAP_INFERNO)
    cv2.imwrite("depth.jpg", depth_color)

def run_onnx(image_path, model_path):
    # Preprocess the image
    image, (orig_h, orig_w) = load_image(image_path)
    # Load the model
    session = onnxruntime.InferenceSession(model_path, providers=onnxruntime.get_available_providers()) # ['CPUExecutionProvider'] | onnxruntime.get_available_providers()
    # Run the model
    depth = session.run(None, {"image": image})[0]
    # Save depth map
    postprocess(depth, (orig_w, orig_h))

if __name__ == "__main__":
    os.chdir('/Users/marinnagy/Documents/Programmation/Python/Depth Anything')
    input_img_path = "assets/frame.jpg"
    model_path = "weights/depth_anything_vits14.onnx"
    depth_map = run_onnx(input_img_path, model_path)

I managed to run inference on MiDaS models without issue, but it seems Depth Anything uses a Split node that is not currently implemented. Any suggestion on how to get around this issue (replace or implement the node?) is welcome, thank you!

PS: I found this thread that talks about a similar error but couldn't really use it for my issue. Also, here is the Split operator from the official onnx GitHub repo.
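
The "Split(18)" in the error refers to opset 18 of the Split operator. One possible workaround (an untested sketch, not a confirmed fix for Depth Anything) is to downgrade the model to an earlier opset with onnx's version converter, so the runtime looks up an older, implemented kernel:

import onnx
from onnx import version_converter

model = onnx.load("weights/depth_anything_vits14.onnx")
# Rewrite the graph against opset 17, where Split uses an earlier operator version
converted = version_converter.convert_version(model, 17)
onnx.save(converted, "weights/depth_anything_vits14_op17.onnx")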

linux package?

Hi,

I'm trying to use ONNX inside a Linux container running on a Mac M1.
Would it make sense to use this package for that? Or should I be able to use another onnxruntime?
Sadly, pip refuses to install onnxruntime-silicon because it is not compatible (different OS).

(as a side note: specifying --platform=linux/amd64 for my Docker container and using the onnxruntime-gpu package instead works, but that requires qemu emulation for the architecture mismatch and makes things a lot slower)
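
For what it's worth, CoreML is a macOS-only framework, so this package cannot help inside a Linux container. Upstream onnxruntime does publish Linux aarch64 (CPU) wheels, which should install natively in an arm64 container, assuming CPU-only inference is acceptable:

# inside an arm64 Linux container (e.g. --platform=linux/arm64)
pip install onnxruntime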

Possible support for using the "GPU" instead of the "CPU" on ARM Macs (M1/M2/etc.) now?

#!/usr/bin/env python3
import onnxruntime as rt

import numpy
from onnxruntime.datasets import get_example

print(rt.get_device())
print(rt.__version__)
print('========')


def test():
    print("running simple inference test...")
    example1 = get_example("sigmoid.onnx")
    sess = rt.InferenceSession(example1, providers=rt.get_available_providers())

    input_name = sess.get_inputs()[0].name
    print("input name", input_name)
    input_shape = sess.get_inputs()[0].shape
    print("input shape", input_shape)
    input_type = sess.get_inputs()[0].type
    print("input type", input_type)

    output_name = sess.get_outputs()[0].name
    print("output name", output_name)
    output_shape = sess.get_outputs()[0].shape
    print("output shape", output_shape)
    output_type = sess.get_outputs()[0].type
    print("output type", output_type)

    import numpy.random

    x = numpy.random.random((3, 4, 5))
    x = x.astype(numpy.float32)
    res = sess.run([output_name], {input_name: x})
    print(res)

def main():
    runtimes = ", ".join(rt.get_available_providers())
    print()
    print(f"Available Providers: {runtimes}")
    print()

    test()

if __name__=="__main__":
    main()

output

CPU
1.16.3
========

Available Providers: CoreMLExecutionProvider, CPUExecutionProvider

running simple inference test...
input name x
input shape [3, 4, 5]
input type tensor(float)
output name y
output shape [3, 4, 5]
output type tensor(float)
[array([[[0.57910156, 0.61865234, 0.5834961 , 0.7050781 , 0.6503906 ],
        [0.64160156, 0.63183594, 0.6098633 , 0.73046875, 0.7211914 ],
        [0.71875   , 0.63964844, 0.5595703 , 0.6591797 , 0.5629883 ],
        [0.5786133 , 0.71435547, 0.56591797, 0.51904297, 0.62353516]],

       [[0.7265625 , 0.5600586 , 0.7290039 , 0.68115234, 0.7109375 ],
        [0.6035156 , 0.61376953, 0.69091797, 0.61279297, 0.55810547],
        [0.52685547, 0.56103516, 0.69921875, 0.5004883 , 0.6533203 ],
        [0.7182617 , 0.66308594, 0.7163086 , 0.58984375, 0.71728516]],

       [[0.546875  , 0.6982422 , 0.58935547, 0.73095703, 0.55371094],
        [0.609375  , 0.6928711 , 0.5371094 , 0.68847656, 0.6147461 ],
        [0.5859375 , 0.72216797, 0.625     , 0.52246094, 0.59716797],
        [0.6777344 , 0.59033203, 0.64941406, 0.6425781 , 0.71191406]]],
      dtype=float32)]

[Process exited 0]

Currently, only the CPU seems to be supported.
Thanks
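
Note that rt.get_device() and rt.get_available_providers() only describe what the build supports; to see which providers a given session actually ended up using, query the session itself. A short sketch reusing the example model above:

import onnxruntime as rt
from onnxruntime.datasets import get_example

sess = rt.InferenceSession(get_example("sigmoid.onnx"), providers=rt.get_available_providers())
# Lists the providers attached to this session, in priority order
print(sess.get_providers())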
