cansik / onnxruntime-silicon
ONNX Runtime prebuilt wheels for Apple Silicon (M1 / M2 / M3 / ARM64)
License: MIT License
Hi,
I just installed onnxruntime-silicon (it appears in my pip list) but I am struggling to import it. If I type 'import onnxruntime-silicon' I get a 'SyntaxError: invalid syntax', probably because of the '-'. I also tried importlib (onnxsilicon = importlib.import_module('onnxruntime-silicon')), but that raises 'ModuleNotFoundError: No module named onnxruntime-silicon'.
Could you please help me import the module? Thanks!
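For anyone hitting this: the distribution name and the module name differ. onnxruntime-silicon installs the regular onnxruntime module, so it is imported under that name. A minimal check:

import onnxruntime as rt  # the module installed by the onnxruntime-silicon package

print(rt.__version__)
print(rt.get_available_providers())  # should include CoreMLExecutionProvider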
I'm not yet sure exactly what I did: I messed around with the onnxruntime source to change the CPU-only CoreML provider flag (somewhat casually, into one requiring the ANE), and eventually got it to build.
Running a face restoration ONNX model, I was watching "powermetrics", disappointed to see the GPU at 0 with my M2 in low power mode. Then I plugged it into power, and suddenly ANE power usage spiked, holding steady at around 10%.
I checked with the M1 Max, and it was the same: the ANE was in use, sometimes topping 20-25%.
I had never seen the ANE being used before, ceteris paribus apart from two things:
either my swapping of every "bicubic" operation (which CoreML does not support) for "bilinear", or the modified runtime I've been trying to figure out.
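A minimal sketch of that swap, assuming the model's PyTorch source is edited before the ONNX export (plain torch API, illustrative shapes):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 128, 128)

# original, not supported by CoreML:
# y = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)

# CoreML-friendly replacement:
y = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)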
On this matter I came to various conclusions, or hypotheses, one of which I worded carefully:
- We should consider that we have only one device (e.g. an M1 Pro) that has three computing units (CPU, GPU, ANE).
- CoreML is one way to communicate with the device; there are alternatives, such as Metal, OpenCL, Vulkan, WebGL, and the ARM providers.
- In this particular context of ours, onnxruntime, we are at the same level as three years ago.
I think there has been some confusion since the moment it was decided to limit the provider to the CPU. I am not saying it's a feature rather than a bug, because it was clearly stated that this decision was made to at least offer something that works; and it does work using only the CPU as the computing unit, a well-documented CoreML processing-unit configuration option.
When converting models from ONNX, all the computing units are selectable in the code.
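For illustration, a hedged sketch of that choice as coremltools exposes it when loading a converted model (assumes coremltools >= 5; "model.mlpackage" is a hypothetical path):

import coremltools as ct

# compute_units selects which units CoreML may use:
# ALL (CPU + GPU + ANE), CPU_ONLY, CPU_AND_GPU, and on newer versions CPU_AND_NE
mlmodel = ct.models.MLModel("model.mlpackage", compute_units=ct.ComputeUnit.ALL)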
So, am I the only one? It could also be an update in something else.
This Neural Engine usage: was it happening before and I only just noticed?
When running "sudo powermetrics" in a terminal, is there an incontestable relation between model inference with the CoreML execution provider and ANE usage?
In Activity Monitor, do you also see nothing significant for the ANE, besides a lot of bytes written?
I used: sudo powermetrics --hide-cpu-duty-cycle -i 2000
Also, I compiled the CoreML build for Intel x86_64 Macs and it runs flawlessly on my Mac mini 2018, giving a tenfold increase in frames per second (still slow though).
Here's the embedding code:
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoModel, AutoTokenizer
import numpy as np

# ONNX and PyTorch variants of the same embedding model
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-small-en-v1.5', file_name="onnx/model.onnx")
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-small-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-small-en-v1.5')
...
inputs = tokenizer(documents, padding=True, truncation=True, return_tensors='pt', max_length=512)
# CLS-token embedding from the PyTorch model
embeddings = model(**inputs)[0][:, 0].detach().numpy()
It works, but only on the CPU; when I tried using .to("mps"), it wouldn't work.
How can I use MPS in this scenario?
Thanks
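A hedged sketch of one way to do this: .to("mps") applies only to the PyTorch model, and every input tensor has to live on the same device (this assumes a PyTorch build with MPS support; the ORT model would instead select an execution provider rather than a device):

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = model.to(device)

inputs = tokenizer(documents, padding=True, truncation=True, return_tensors='pt', max_length=512)
inputs = {k: v.to(device) for k, v in inputs.items()}  # move inputs to the same device

with torch.no_grad():
    embeddings = model(**inputs)[0][:, 0].cpu().numpy()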
Hi,
I'm trying to use ONNX inside a Linux container running on a Mac M1.
Would it make sense to use this package for that? Or should I be able to use another onnxruntime build?
Sadly, pip refuses to install onnxruntime-silicon because it is not compatible (different OS).
(As a side note: specifying --platform=linux/amd64 for my Docker container and using the onnxruntime-gpu package instead works, but that requires QEMU emulation for the architecture mismatch and makes things a lot slower.)
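For what it's worth, a hedged alternative: onnxruntime-silicon only ships macOS arm64 wheels, but the stock onnxruntime package publishes manylinux aarch64 wheels, so an arm64 Linux container on Apple Silicon can run it natively, without QEMU:

# python:3.11-slim is just an example base image; inside the container
# only CPUExecutionProvider is available, since CoreML is macOS-only
docker run --rm --platform=linux/arm64 python:3.11-slim \
    sh -c "pip install onnxruntime && python -c 'import onnxruntime as rt; print(rt.get_available_providers())'"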
Hi, I have an M2 Mac and I'm getting the same "module not found" error that I get with the main onnxruntime package.
What am I doing wrong?
I am running a face model:
https://github.com/iperov/DeepFaceLive/releases/download/ZAHAR_LUPIN/Zahar_Lupin.dfm
environment:
onnxruntime-coreml == 1.13.1
onnxruntime-silicon == 1.13.1
device: Apple Silicon M1
python:
import onnx
import onnxruntime as rt
options = rt.SessionOptions()
options.log_severity_level = 0
options.intra_op_num_threads = 4
options.execution_mode = rt.ExecutionMode.ORT_SEQUENTIAL
options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL
onnxSession = rt.InferenceSession(onnx_model_path, options, providers=[{"CoreMLExecutionProvider"}])
logcat:
EP Error using [{'CoreMLExecutionProvider'}]
Falling back to ['CPUExecutionProvider'] and retrying.
2023-07-25 11:11:48.215511 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-07-25 11:11:48.215528 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2023-07-25 11:11:48.215535 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
(the EP Error / fallback block above repeats twice more with later timestamps)
2023-07-25 11:11:48.290081 [I:onnxruntime:, inference_session.cc:1222 Initialize] Initializing session.
2023-07-25 11:11:48.293215 [I:onnxruntime:, reshape_fusion.cc:42 ApplyImpl] Total fused reshape node count: 0
2023-07-25 11:11:48.295481 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
(the "Matched Conv" line repeats 18 times in total)
2023-07-25 11:11:48.296167 [V:onnxruntime:, session_state.cc:1010 VerifyEachNodeIsAssignedToAnEp] Node placements
2023-07-25 11:11:48.296189 [V:onnxruntime:, session_state.cc:1013 VerifyEachNodeIsAssignedToAnEp] All nodes placed on [CPUExecutionProvider]. Number of nodes: 69
2023-07-25 11:11:48.296278 [V:onnxruntime:, session_state.cc:66 CreateGraphInfo] SaveMLValueNameIndexMapping
2023-07-25 11:11:48.296307 [V:onnxruntime:, session_state.cc:112 CreateGraphInfo] Done saving OrtValue mappings.
2023-07-25 11:11:48.296561 [I:onnxruntime:, session_state_utils.cc:199 SaveInitializedTensors] Saving initialized tensors.
2023-07-25 11:11:48.297107 [I:onnxruntime:, session_state_utils.cc:342 SaveInitializedTensors] Done saving initialized tensors
###############
onnxruntime-silicon has been installed, but it only supports the CPU backend, not the CoreML backend. I don't know the reason; can you help me?
Thanks in advance!
Python 3.9.16
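A likely cause, going by the "EP Error using [{'CoreMLExecutionProvider'}]" lines above: providers was passed as a list containing a set, so onnxruntime rejected it and fell back to the CPU. A corrected sketch (same names as the snippet above):

import onnxruntime as rt

options = rt.SessionOptions()
options.intra_op_num_threads = 4

# providers must be a list of provider-name strings, not a set:
onnxSession = rt.InferenceSession(
    onnx_model_path,
    sess_options=options,
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
print(onnxSession.get_providers())  # shows which EPs were actually loaded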
I tried to build from source by running ./build-macos.sh
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/util/type_resolver_util.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wire_format.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wrappers.pb.cc.o
[ 23%] Linking CXX static library libprotobuf.a
[ 23%] Built target libprotobuf
make: *** [all] Error 2
Traceback (most recent call last):
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2812, in <module>
    sys.exit(main())
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2727, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 1349, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 740, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/homebrew/bin/cmake', '--build', '/Users/X/git/onnxruntime-silicon/onnxruntime/build/MacOS/Release', '--config', 'Release', '--', '-j8']' returned non-zero exit status 2.
#!/usr/bin/env python3
import onnxruntime as rt
import numpy
from onnxruntime.datasets import get_example

print(rt.get_device())
print(rt.__version__)
print('========')

def test():
    print("running simple inference test...")
    example1 = get_example("sigmoid.onnx")
    sess = rt.InferenceSession(example1, providers=rt.get_available_providers())

    input_name = sess.get_inputs()[0].name
    print("input name", input_name)
    input_shape = sess.get_inputs()[0].shape
    print("input shape", input_shape)
    input_type = sess.get_inputs()[0].type
    print("input type", input_type)

    output_name = sess.get_outputs()[0].name
    print("output name", output_name)
    output_shape = sess.get_outputs()[0].shape
    print("output shape", output_shape)
    output_type = sess.get_outputs()[0].type
    print("output type", output_type)

    x = numpy.random.random((3, 4, 5)).astype(numpy.float32)
    res = sess.run([output_name], {input_name: x})
    print(res)

def main():
    runtimes = ", ".join(rt.get_available_providers())
    print()
    print(f"Available Providers: {runtimes}")
    print()
    test()

if __name__ == "__main__":
    main()
CPU
1.16.3
========
Available Providers: CoreMLExecutionProvider, CPUExecutionProvider
running simple inference test...
input name x
input shape [3, 4, 5]
input type tensor(float)
output name y
output shape [3, 4, 5]
output type tensor(float)
[array([[[0.57910156, 0.61865234, 0.5834961 , 0.7050781 , 0.6503906 ],
[0.64160156, 0.63183594, 0.6098633 , 0.73046875, 0.7211914 ],
[0.71875 , 0.63964844, 0.5595703 , 0.6591797 , 0.5629883 ],
[0.5786133 , 0.71435547, 0.56591797, 0.51904297, 0.62353516]],
[[0.7265625 , 0.5600586 , 0.7290039 , 0.68115234, 0.7109375 ],
[0.6035156 , 0.61376953, 0.69091797, 0.61279297, 0.55810547],
[0.52685547, 0.56103516, 0.69921875, 0.5004883 , 0.6533203 ],
[0.7182617 , 0.66308594, 0.7163086 , 0.58984375, 0.71728516]],
[[0.546875 , 0.6982422 , 0.58935547, 0.73095703, 0.55371094],
[0.609375 , 0.6928711 , 0.5371094 , 0.68847656, 0.6147461 ],
[0.5859375 , 0.72216797, 0.625 , 0.52246094, 0.59716797],
[0.6777344 , 0.59033203, 0.64941406, 0.6425781 , 0.71191406]]],
dtype=float32)]
[Process exited 0]
Currently only the CPU is supported.
Thanks
Can it support an AMD GPU if I compile this for my x86_64 macOS?
I'm trying to run inference with Depth Anything ONNX models on macOS M1, in Python with onnxruntime-silicon. The models are converted from .pth to ONNX.
I can run the inference on the CPU, but I get the following error when running on the GPU:
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Failed to find kernel for Split(18) (node Split). Kernel not found
Here is my python code:
import cv2
import numpy as np
import onnxruntime
from depth_anything.util.transform import load_image
import os

def postprocess(depth, original_size):
    # Resize, apply a color map and save the depth map
    depth = cv2.resize(depth[0, 0], original_size)
    depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
    depth = depth.astype(np.uint8)
    depth_color = cv2.applyColorMap(depth, cv2.COLORMAP_INFERNO)
    cv2.imwrite("depth.jpg", depth_color)

def run_onnx(image_path, model_path):
    # Preprocess the image
    image, (orig_h, orig_w) = load_image(image_path)
    # Load the model
    session = onnxruntime.InferenceSession(model_path, providers=onnxruntime.get_available_providers())  # ['CPUExecutionProvider'] | onnxruntime.get_available_providers()
    # Run the model
    depth = session.run(None, {"image": image})[0]
    # Save depth map
    postprocess(depth, (orig_w, orig_h))

if __name__ == "__main__":
    os.chdir('/Users/marinnagy/Documents/Programmation/Python/Depth Anything')
    input_img_path = "assets/frame.jpg"
    model_path = "weights/depth_anything_vits14.onnx"
    depth_map = run_onnx(input_img_path, model_path)
I managed to run inference on MiDaS models without issue, but it seems Depth Anything uses a Split node that is not currently implemented. Any suggestion on how to get around this issue (replace or implement the node?) is welcome, thank you!
PS: I found a thread discussing a similar error but couldn't really apply it to my issue. Also, here is the Split operator from the official onnx GitHub repo.
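One possible workaround, assuming the failure really comes from the opset-18 Split: convert the model down to an opset whose Split kernel is implemented, using onnx's version converter (it may not succeed for every model, and the target opset of 17 below is an assumption):

import onnx
from onnx import version_converter

model = onnx.load("weights/depth_anything_vits14.onnx")
converted = version_converter.convert_version(model, 17)  # target opset 17
onnx.save(converted, "weights/depth_anything_vits14_op17.onnx")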
Hi, I'm testing fastembed.js, which uses onnx (from Node.js). How can I replace the default onnxruntime with onnxruntime-silicon?
Thanks
I'm trying to compile onnxruntime-silicon 1.15.0 on my MacBook Pro M1 Pro running macOS Ventura, and the process fails with a compilation error. I don't have much information, except that it may be caused by libprotobuf. I installed the build dependencies with brew beforehand, as mentioned in the repo, so libprotobuf should be installed.
Thanks for the great work. I've been using this since ort-1.13, on a MBP with an M1 Pro chip.
The problem is, after I updated my system from macOS 13 to 14, all the models using the CoreML EP became slower than before the update (still faster than the CPU EP, though). Performance dropped by roughly 50-75% on average. I didn't make a Time Machine backup before updating, so downgrading the system back to 13 isn't a good option.
I managed to build a wheel manually instead of pip install onnxruntime-silicon, but the performance remains the same.
Will macOS 14 be supported soon?
Hey my friend,
please release onnxruntime-silicon==1.17.0
I guess it's the perfect opportunity to thank you for all your efforts over the past months.