Comments (6)
Use the latest 20240410 release.
Pass fp16=0 when converting with pnnx.
Disable fp16 in ncnn: https://github.com/Tencent/ncnn/wiki/FAQ-ncnn-produce-wrong-result#disable-fp16
That should give you exactly matching results.
from ncnn.
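As a rough illustration of what disabling fp16 buys, the sketch below uses numpy (not ncnn itself) to show the rounding that fp16 storage introduces; in the C++ API the corresponding switches are `net.opt.use_fp16_packed`, `net.opt.use_fp16_storage` and `net.opt.use_fp16_arithmetic`, set to false before `load_param`:

```python
import numpy as np

# fp32 confidence value as reported by pytorch later in this thread
conf = np.float32(0.900195)

# Storing it as fp16 (ncnn's default on ARM) snaps it onto the fp16 grid,
# whose spacing just below 1.0 is 2**-11 ~= 0.00049.
conf_fp16 = np.float16(conf)
print(float(conf_fp16))                    # 0.900390625
print(float(np.spacing(np.float16(0.9))))  # 0.00048828125, the fp16 step size
```

Deviations of that magnitude in the boxes and confidences are therefore expected whenever any fp16 storage or arithmetic is in the pipeline; with all three options off, results stay bit-exact in fp32.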
With ncnn-20240102 and fp16 disabled, the detection results match pytorch. Disabling fp16 makes inference 50% slower, though. Can fp16 be enabled and still give correct results?
Use the 20240410 release.
With the model from https://github.com/Qengineering/YoloV8-ncnn-Raspberry-Pi-4, which strips part of the output decoding from the exported graph, detection results are correct even with fp16 enabled. Are some operators in the original model's output head unsupported?
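For context, that model keeps YOLOv8's box decoding (the DFL step, a softmax over reg_max=16 distance bins per box side) out of the net and does it in plain fp32 code. A minimal numpy sketch of that decode step (names are mine, not from either repo):

```python
import numpy as np

def dfl_decode(logits, reg_max=16):
    """Decode one anchor's DFL distributions into 4 box distances.

    logits: (4, reg_max) raw scores for left/top/right/bottom.
    Doing the softmax + expectation outside the net keeps this
    fp16-sensitive step in fp32.
    """
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    prob = e / e.sum(axis=1, keepdims=True)
    return prob @ np.arange(reg_max, dtype=np.float32)      # expected distance

# Uniform logits decode to the mean bin index, (0 + ... + 15) / 16 = 7.5
print(dfl_decode(np.zeros((4, 16))))  # [7.5 7.5 7.5 7.5]
```

Moving this step out of the graph sidesteps whichever head operators fp16 handles poorly, which would explain why the stripped model is correct with fp16 enabled.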
With 20240410 and fp16 enabled, I do get correct results, but with some loss of precision.
pytorch detection results (half=False, fp16 disabled):
Original model:
[class] [x_center] [y_center] [width] [height] [confidence]
20 0.578238 0.674553 0.595038 0.432779 0.900195
33 0.287819 0.258129 0.104904 0.100881 0.671818
33 0.113862 0.0640004 0.222977 0.128001 0.64798
33 0.248576 0.161419 0.0472315 0.0305715 0.539291
14 0.599585 0.415717 0.0515121 0.0925999 0.472335
ncnn model:
20 0.578125 0.675 0.59375 0.433594 0.902344
33 0.1125 0.0645264 0.224219 0.129053 0.731445
33 0.287891 0.257812 0.104883 0.101074 0.689941
33 0.248437 0.161426 0.0470703 0.0304688 0.547363
14 0.6 0.416016 0.0527344 0.0919922 0.474854
C++ detection results:
Kunpeng-920, fp16 enabled
20 = 0.89453 at 417.00 451.00 887.00 x 442.00
33 = 0.72852 at 352.00 211.00 157.00 x 99.00
33 = 0.72705 at 0.00 0.00 337.00 x 127.00
33 = 0.65088 at 337.00 145.00 68.00 x 35.00
14 = 0.63623 at 861.00 373.00 75.00 x 90.00
Kunpeng-920, fp16 disabled
20 = 0.89414 at 416.00 451.00 888.00 x 442.00
33 = 0.73525 at 352.00 210.00 157.00 x 99.00
33 = 0.72803 at 0.00 0.00 337.00 x 128.00
33 = 0.65210 at 337.00 145.00 69.00 x 35.00
14 = 0.62812 at 861.00 373.00 75.00 x 90.00
Intel(R) Xeon(R) Gold 6240, fp16 disabled
20 = 0.89415 at 416.00 451.00 888.00 x 442.00
33 = 0.73525 at 352.00 210.00 157.00 x 99.00
33 = 0.72803 at 0.00 0.00 337.00 x 128.00
33 = 0.65210 at 337.00 145.00 69.00 x 35.00
14 = 0.62812 at 861.00 373.00 75.00 x 90.00
The C++ results with fp16 disabled also deviate slightly from running the ncnn model in pytorch with fp16 disabled.
With fp16 disabled, the Kunpeng and x64 results are exactly identical.
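Note the two outputs use different conventions: pytorch prints normalized (x_center, y_center, width, height), while the C++ demo prints pixel (left, top, width x height). A small helper for converting between them so the tables can be compared (the image size below is a hypothetical placeholder, it is not given in this thread):

```python
def to_pixel_box(cx, cy, w, h, img_w, img_h):
    """Normalized (cx, cy, w, h) -> pixel (left, top, width, height)."""
    left = (cx - w / 2) * img_w
    top = (cy - h / 2) * img_h
    return left, top, w * img_w, h * img_h

# First pytorch detection above, with an assumed 640x640 input
print(to_pixel_box(0.578238, 0.674553, 0.595038, 0.432779, 640, 640))
```

Any remaining offset of a pixel or two after conversion comes from the fp16/fp32 rounding discussed above plus letterbox padding, not from a wrong box.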
That deviation is just fp16 vs fp32 rounding; it does not affect practical use.
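To put a number on that, here is a quick check (not a rigorous accuracy measurement) over the five Kunpeng-920 confidences reported above, with and without fp16:

```python
# Confidence columns copied from the two Kunpeng-920 tables above
fp16 = [0.89453, 0.72852, 0.72705, 0.65088, 0.63623]
fp32 = [0.89414, 0.73525, 0.72803, 0.65210, 0.62812]

# Largest absolute confidence difference between the two runs
max_diff = max(abs(a - b) for a, b in zip(fp16, fp32))
print(round(max_diff, 5))  # 0.00811
```

The worst case is under 0.01, i.e. well below any sensible confidence threshold, which is consistent with the error being pure fp16 rounding.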
Related Issues (20)
- benchmark test shows low CPU utilization
- Manually created net runs inference much slower
- I have 3 GPUs, but get_gpu_count()=1 HOT 8
- pnnx and ncnn outputs are inconsistent
- I convert onnx to ncnn successfully, but my inference output is all nan, e.g. the output of net.extract() is all nan. HOT 15
- On Kunpeng 920, the int8-quantized yolov8n model is 50% slower than the default fp16 HOT 5
- Segfault when running ./ncnn2table on the second stage of the mtcnn model
- Bad performance for int8 inference on XuanTie 906 (RISC-V) HOT 1
- The intrinsic code does not reflect the algorithm's original design; is there a plan to upgrade the intrinsic code and set vset to ta,mu
- Can simplestl be used in an RTOS?
- [ncnn-android-yolov8] How to handle real-time detect when the view set orientation to "landscape" ? HOT 1
- pnnx converts the model fine, but the model output differs from the onnx model output HOT 4
- EfficientPhys onnx-to-ncnn model conversion error HOT 3
- Converted my own yolov8 model to ncnn and deployed on Windows; the interface matches up, but no results come out
- Can the matrix multiplication API be used standalone? HOT 1
- Model conversion fails; how do I track down the cause
- how do I get the fossilize file .foz out of the vulkan driver? HOT 2
- Floating-point exception when running on the Loongson education Pi 2k1000
- a minor issue in prebuilt ncnn-android libs