cansik / onnxruntime-silicon
ONNX Runtime prebuilt wheels for Apple Silicon (M1 / M2 / M3 / ARM64)
License: MIT License
Hi,
I just installed onnxruntime-silicon (it appears in my pip list) but I am struggling to import it. If I type 'import onnxruntime-silicon' I get a 'SyntaxError: invalid syntax', probably because of the '-'. I also tried importlib (onnxsilicon = importlib.import_module('onnxruntime-silicon')), but that raises 'ModuleNotFoundError: No module named onnxruntime-silicon'.
Could you please help me import the module? Thanks!
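For anyone hitting this: the distribution name and the module name differ. onnxruntime-silicon installs the regular onnxruntime module, so it is imported under that name. A minimal check:

import onnxruntime as rt  # the module installed by the onnxruntime-silicon package

print(rt.__version__)
print(rt.get_available_providers())  # should include CoreMLExecutionProvider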
I'm not yet sure exactly what I did: I messed around with the onnxruntime source to change the CPU-only CoreML provider flag (somewhat casually, into one requiring the ANE), and eventually got it to build.
Running a face restoration ONNX model, I was watching "powermetrics", disappointed to see the GPU at 0 with my M2 in low power mode. Then I plugged it into power, and suddenly ANE power usage spiked, holding steady at around 10%.
I checked with the M1 Max, and it was the same: the ANE was in use, sometimes topping 20-25%.
I had never seen the ANE being used before, ceteris paribus apart from two things:
either my swapping of every "bicubic" operation (which CoreML does not support) for "bilinear", or the modified runtime I've been trying to figure out.
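A minimal sketch of that swap, assuming the model's PyTorch source is edited before the ONNX export (plain torch API, illustrative shapes):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 128, 128)

# original, not supported by CoreML:
# y = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)

# CoreML-friendly replacement:
y = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)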
On this matter I came to various conclusions, or hypotheses, one of which I worded carefully:
- We should consider that we have only one device (e.g. an M1 Pro) that has three computing units (CPU, GPU, ANE).
- CoreML is one way to communicate with the device; there are alternatives, such as Metal, OpenCL, Vulkan, WebGL, and the ARM providers.
- In this particular context of ours, onnxruntime, we are at the same level as three years ago.
I think there has been some confusion since the moment it was decided to limit the provider to the CPU. I am not saying it's a feature rather than a bug, because it was clearly stated that this decision was made to at least offer something that works; and it does work using only the CPU as the computing unit, a well-documented CoreML processing-unit configuration option.
When converting models from ONNX, all the computing units are selectable in the code.
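For illustration, a hedged sketch of that choice as coremltools exposes it when loading a converted model (assumes coremltools >= 5; "model.mlpackage" is a hypothetical path):

import coremltools as ct

# compute_units selects which units CoreML may use:
# ALL (CPU + GPU + ANE), CPU_ONLY, CPU_AND_GPU, and on newer versions CPU_AND_NE
mlmodel = ct.models.MLModel("model.mlpackage", compute_units=ct.ComputeUnit.ALL)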
So, am I the only one? It could also be an update in something else.
This Neural Engine usage: was it happening before and I only just noticed?
When running "sudo powermetrics" in a terminal, is there an incontestable relation between model inference with the CoreML execution provider and ANE usage?
In Activity Monitor, do you also see nothing significant for the ANE, besides a lot of bytes written?
I used: sudo powermetrics --hide-cpu-duty-cycle -i 2000
Also, I compiled the CoreML build for Intel x86_64 Macs and it runs flawlessly on my Mac mini 2018, giving a tenfold increase in frames per second (still slow though).
Here's the embedding code:
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoModel, AutoTokenizer
import numpy as np

# ONNX and PyTorch variants of the same embedding model
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-small-en-v1.5', file_name="onnx/model.onnx")
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-small-en-v1.5')
model = AutoModel.from_pretrained('BAAI/bge-small-en-v1.5')
...
inputs = tokenizer(documents, padding=True, truncation=True, return_tensors='pt', max_length=512)
# CLS-token embedding from the PyTorch model
embeddings = model(**inputs)[0][:, 0].detach().numpy()
It works, but only on the CPU; when I tried using .to("mps"), it wouldn't work.
How can I use MPS in this scenario?
Thanks
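A hedged sketch of one way to do this: .to("mps") applies only to the PyTorch model, and every input tensor has to live on the same device (this assumes a PyTorch build with MPS support; the ORT model would instead select an execution provider rather than a device):

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = model.to(device)

inputs = tokenizer(documents, padding=True, truncation=True, return_tensors='pt', max_length=512)
inputs = {k: v.to(device) for k, v in inputs.items()}  # move inputs to the same device

with torch.no_grad():
    embeddings = model(**inputs)[0][:, 0].cpu().numpy()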
Hi,
I'm trying to use ONNX inside a Linux container running on a Mac M1.
Would it make sense to use this package for that? Or should I be able to use another onnxruntime build?
Sadly, pip refuses to install onnxruntime-silicon because it is not compatible (different OS).
(As a side note: specifying --platform=linux/amd64 for my Docker container and using the onnxruntime-gpu package instead works, but that requires QEMU emulation for the architecture mismatch and makes things a lot slower.)
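For what it's worth, a hedged alternative: onnxruntime-silicon only ships macOS arm64 wheels, but the stock onnxruntime package publishes manylinux aarch64 wheels, so an arm64 Linux container on Apple Silicon can run it natively, without QEMU:

# python:3.11-slim is just an example base image; inside the container
# only CPUExecutionProvider is available, since CoreML is macOS-only
docker run --rm --platform=linux/arm64 python:3.11-slim \
    sh -c "pip install onnxruntime && python -c 'import onnxruntime as rt; print(rt.get_available_providers())'"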
Hi, I have an M2 Mac and I'm getting the same "module not found" error that I get with the main onnxruntime package.
What am I doing wrong?
I am running a face model:
https://github.com/iperov/DeepFaceLive/releases/download/ZAHAR_LUPIN/Zahar_Lupin.dfm
environment:
onnxruntime-coreml == 1.13.1
onnxruntime-silicon == 1.13.1
device: Apple Silicon M1
python:
import onnx
import onnxruntime as rt
options = rt.SessionOptions()
options.log_severity_level = 0
options.intra_op_num_threads = 4
options.execution_mode = rt.ExecutionMode.ORT_SEQUENTIAL
options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL
onnxSession = rt.InferenceSession(onnx_model_path, options, providers=[{"CoreMLExecutionProvider"}])
logcat:
EP Error using [{'CoreMLExecutionProvider'}]
Falling back to ['CPUExecutionProvider'] and retrying.
2023-07-25 11:11:48.215511 [I:onnxruntime:, inference_session.cc:263 operator()] Flush-to-zero and denormal-as-zero are off
2023-07-25 11:11:48.215528 [I:onnxruntime:, inference_session.cc:271 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2023-07-25 11:11:48.215535 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Dynamic block base set to 0
(the EP Error / fallback block above repeats twice more with later timestamps)
2023-07-25 11:11:48.290081 [I:onnxruntime:, inference_session.cc:1222 Initialize] Initializing session.
2023-07-25 11:11:48.293215 [I:onnxruntime:, reshape_fusion.cc:42 ApplyImpl] Total fused reshape node count: 0
2023-07-25 11:11:48.295481 [V:onnxruntime:, selector_action_transformer.cc:129 MatchAndProcess] Matched Conv
(the "Matched Conv" line repeats 18 times in total)
2023-07-25 11:11:48.296167 [V:onnxruntime:, session_state.cc:1010 VerifyEachNodeIsAssignedToAnEp] Node placements
2023-07-25 11:11:48.296189 [V:onnxruntime:, session_state.cc:1013 VerifyEachNodeIsAssignedToAnEp] All nodes placed on [CPUExecutionProvider]. Number of nodes: 69
2023-07-25 11:11:48.296278 [V:onnxruntime:, session_state.cc:66 CreateGraphInfo] SaveMLValueNameIndexMapping
2023-07-25 11:11:48.296307 [V:onnxruntime:, session_state.cc:112 CreateGraphInfo] Done saving OrtValue mappings.
2023-07-25 11:11:48.296561 [I:onnxruntime:, session_state_utils.cc:199 SaveInitializedTensors] Saving initialized tensors.
2023-07-25 11:11:48.297107 [I:onnxruntime:, session_state_utils.cc:342 SaveInitializedTensors] Done saving initialized tensors
###############
onnxruntime-silicon has been installed, but it only supports the CPU backend, not the CoreML backend. I don't know the reason; can you help me?
Thanks in advance!
Python 3.9.16
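A likely cause, going by the "EP Error using [{'CoreMLExecutionProvider'}]" lines above: providers was passed as a list containing a set, so onnxruntime rejected it and fell back to the CPU. A corrected sketch (same names as the snippet above):

import onnxruntime as rt

options = rt.SessionOptions()
options.intra_op_num_threads = 4

# providers must be a list of provider-name strings, not a set:
onnxSession = rt.InferenceSession(
    onnx_model_path,
    sess_options=options,
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
print(onnxSession.get_providers())  # shows which EPs were actually loaded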
I tried to build from source by running ./build-macos.sh
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/util/type_resolver_util.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wire_format.cc.o
[ 23%] Building CXX object external/protobuf/cmake/CMakeFiles/libprotobuf.dir/__/src/google/protobuf/wrappers.pb.cc.o
[ 23%] Linking CXX static library libprotobuf.a
[ 23%] Built target libprotobuf
make: *** [all] Error 2
Traceback (most recent call last):
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2812, in <module>
    sys.exit(main())
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 2727, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 1349, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/ci_build/build.py", line 740, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/Users/X/git/onnxruntime-silicon/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/homebrew/bin/cmake', '--build', '/Users/X/git/onnxruntime-silicon/onnxruntime/build/MacOS/Release', '--config', 'Release', '--', '-j8']' returned non-zero exit status 2.
#!/usr/bin/env python3
import onnxruntime as rt
import numpy
from onnxruntime.datasets import get_example

print(rt.get_device())
print(rt.__version__)
print('========')

def test():
    print("running simple inference test...")
    example1 = get_example("sigmoid.onnx")
    sess = rt.InferenceSession(example1, providers=rt.get_available_providers())

    input_name = sess.get_inputs()[0].name
    print("input name", input_name)
    input_shape = sess.get_inputs()[0].shape
    print("input shape", input_shape)
    input_type = sess.get_inputs()[0].type
    print("input type", input_type)

    output_name = sess.get_outputs()[0].name
    print("output name", output_name)
    output_shape = sess.get_outputs()[0].shape
    print("output shape", output_shape)
    output_type = sess.get_outputs()[0].type
    print("output type", output_type)

    x = numpy.random.random((3, 4, 5)).astype(numpy.float32)
    res = sess.run([output_name], {input_name: x})
    print(res)

def main():
    runtimes = ", ".join(rt.get_available_providers())
    print()
    print(f"Available Providers: {runtimes}")
    print()
    test()

if __name__ == "__main__":
    main()
CPU
1.16.3
========
Available Providers: CoreMLExecutionProvider, CPUExecutionProvider
running simple inference test...
input name x
input shape [3, 4, 5]
input type tensor(float)
output name y
output shape [3, 4, 5]
output type tensor(float)
[array([[[0.57910156, 0.61865234, 0.5834961 , 0.7050781 , 0.6503906 ],
[0.64160156, 0.63183594, 0.6098633 , 0.73046875, 0.7211914 ],
[0.71875 , 0.63964844, 0.5595703 , 0.6591797 , 0.5629883 ],
[0.5786133 , 0.71435547, 0.56591797, 0.51904297, 0.62353516]],
[[0.7265625 , 0.5600586 , 0.7290039 , 0.68115234, 0.7109375 ],
[0.6035156 , 0.61376953, 0.69091797, 0.61279297, 0.55810547],
[0.52685547, 0.56103516, 0.69921875, 0.5004883 , 0.6533203 ],
[0.7182617 , 0.66308594, 0.7163086 , 0.58984375, 0.71728516]],
[[0.546875 , 0.6982422 , 0.58935547, 0.73095703, 0.55371094],
[0.609375 , 0.6928711 , 0.5371094 , 0.68847656, 0.6147461 ],
[0.5859375 , 0.72216797, 0.625 , 0.52246094, 0.59716797],
[0.6777344 , 0.59033203, 0.64941406, 0.6425781 , 0.71191406]]],
dtype=float32)]
[Process exited 0]
Currently only the CPU is supported.
Thanks
Can it support an AMD GPU if I compile this for my x86_64 macOS?
I'm trying to run inference with Depth Anything ONNX models on macOS M1, in Python with onnxruntime-silicon. The models are converted from .pth to ONNX.
I can run the inference on the CPU, but I get the following error when running on the GPU:
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Failed to find kernel for Split(18) (node Split). Kernel not found
Here is my python code:
import cv2
import numpy as np
import onnxruntime
from depth_anything.util.transform import load_image
import os

def postprocess(depth, original_size):
    # Resize, apply a color map and save the depth map
    depth = cv2.resize(depth[0, 0], original_size)
    depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
    depth = depth.astype(np.uint8)
    depth_color = cv2.applyColorMap(depth, cv2.COLORMAP_INFERNO)
    cv2.imwrite("depth.jpg", depth_color)

def run_onnx(image_path, model_path):
    # Preprocess the image
    image, (orig_h, orig_w) = load_image(image_path)
    # Load the model
    session = onnxruntime.InferenceSession(model_path, providers=onnxruntime.get_available_providers())  # ['CPUExecutionProvider'] | onnxruntime.get_available_providers()
    # Run the model
    depth = session.run(None, {"image": image})[0]
    # Save depth map
    postprocess(depth, (orig_w, orig_h))

if __name__ == "__main__":
    os.chdir('/Users/marinnagy/Documents/Programmation/Python/Depth Anything')
    input_img_path = "assets/frame.jpg"
    model_path = "weights/depth_anything_vits14.onnx"
    depth_map = run_onnx(input_img_path, model_path)
I managed to run inference on MiDaS models without issue, but it seems Depth Anything uses a Split node that is not currently implemented. Any suggestion on how to get around this issue (replace or implement the node?) is welcome, thank you!
PS: I found a thread discussing a similar error but couldn't really apply it to my issue. Also, here is the Split operator from the official onnx GitHub repo.
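One possible workaround, assuming the failure really comes from the opset-18 Split: convert the model down to an opset whose Split kernel is implemented, using onnx's version converter (it may not succeed for every model, and the target opset of 17 below is an assumption):

import onnx
from onnx import version_converter

model = onnx.load("weights/depth_anything_vits14.onnx")
converted = version_converter.convert_version(model, 17)  # target opset 17
onnx.save(converted, "weights/depth_anything_vits14_op17.onnx")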
Hi, I'm testing fastembed.js, which uses onnx (from Node.js). How can I replace the default onnxruntime with onnxruntime-silicon?
Thanks
I'm trying to compile onnxruntime-silicon 1.15.0 on my MacBook Pro M1 Pro running macOS Ventura, and the process fails with a compilation error. I don't have much information, except that it may be caused by libprotobuf. I installed the build dependencies with brew beforehand, as mentioned in the repo, so libprotobuf should be installed.
Thanks for the great work. I've been using this since ort-1.13, on a MBP with an M1 Pro chip.
The problem is, after I updated my system from macOS 13 to 14, all the models using the CoreML EP became slower than before the update (still faster than the CPU EP, though). Performance dropped by roughly 50-75% on average. I didn't make a Time Machine backup before updating, so downgrading the system back to 13 isn't a good option.
I managed to build a wheel manually instead of pip install onnxruntime-silicon, but the performance remains the same.
Will macOS 14 be supported soon?
Hey my friend,
please release onnxruntime-silicon==1.17.0
I guess it's the perfect opportunity to thank you for all your efforts over the past months.