
libtorch-yolov5's Introduction

Introduction

A LibTorch inference implementation of the yolov5 object detection algorithm. Both GPU and CPU are supported.

Dependencies

  • Ubuntu 16.04
  • CUDA 10.2
  • OpenCV 3.4.12
  • LibTorch 1.6.0

TorchScript Model Export

Please refer to the official document here: ultralytics/yolov5#251

Mandatory update: developers need to modify the following line in the original export.py from yolov5:

# line 29
model.model[-1].export = False

Add GPU support: note that the current export script in yolov5 uses the CPU by default; export.py needs to be modified as follows to support the GPU:

# line 28
img = torch.zeros((opt.batch_size, 3, *opt.img_size)).to(device='cuda')  
# line 31
model = attempt_load(opt.weights, map_location=torch.device('cuda'))

Export a trained yolov5 model:

cd yolov5
export PYTHONPATH="$PWD"  # add path
python models/export.py --weights yolov5s.pt --img 640 --batch 1  # export

Setup

$ cd /path/to/libtorch-yolov5
$ wget https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.6.0.zip
$ unzip libtorch-cxx11-abi-shared-with-deps-1.6.0.zip
$ mkdir build && cd build
$ cmake .. && make
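
If the build succeeds, the exported TorchScript file can also be sanity-checked independently of this repo. Below is a minimal loader sketch using only the stock torch::jit API; the weight path and input shape are examples matching the export step above, not part of this repo:

#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
    try {
        // Load the exported TorchScript module and switch to inference mode.
        torch::jit::script::Module module = torch::jit::load("../weights/yolov5s.torchscript.pt");
        module.eval();

        // Dummy forward pass with the export-time input shape (batch 1, 3x640x640).
        std::vector<torch::jit::IValue> inputs;
        inputs.emplace_back(torch::zeros({1, 3, 640, 640}));
        module.forward(inputs);
        std::cout << "Model loaded and forward pass succeeded." << std::endl;
    } catch (const c10::Error& e) {
        std::cerr << "Error loading the model: " << e.what() << std::endl;
        return -1;
    }
    return 0;
}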

To run inference on examples in the ./images folder:

# CPU
$ ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --view-img
# GPU
$ ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --gpu --view-img
# Profiling
$ CUDA_LAUNCH_BLOCKING=1 ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --gpu --view-img
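
A note on profiling: CUDA kernel launches are asynchronous, so wall-clock timings taken around forward() can be misleading; CUDA_LAUNCH_BLOCKING=1 forces synchronous launches so time is attributed to the right call. The same effect can be achieved in code by synchronizing explicitly. A sketch, independent of this repo's Detector class, using the current CUDA stream:

#include <torch/script.h>
#include <c10/cuda/CUDAStream.h>
#include <chrono>
#include <vector>

// Time one forward pass, flushing pending GPU work before reading the clock.
double TimedForwardMs(torch::jit::script::Module& module, const torch::Tensor& input) {
    std::vector<torch::jit::IValue> inputs{input};
    auto start = std::chrono::steady_clock::now();
    module.forward(inputs);
    // Wait until all kernels queued on the current stream have finished.
    c10::cuda::getCurrentCUDAStream().synchronize();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}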

Demo

Bus

Zidane

FAQ

  1. terminate called after throwing an instance of 'c10::Error' what(): isTuple() INTERNAL ASSERT FAILED

    • Make sure "model.model[-1].export = False" is set when running the export script.

  2. Why does the first "inference takes" entry in the log report such a long time?

    • The first inference is slower because of the initial optimization that the JIT (just-in-time compiler) performs on your code, similar to the "warm up" phase in other JIT compilers. Typically, production services warm up a model on representative inputs before marking it as available.

    • The first cycle may therefore take noticeably longer. The yolov5 Python version runs inference once on an empty image before the actual detection pipeline. You can modify the code to process the same image multiple times, or to process a video, to obtain valid timings.
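
Such a warm-up pass can be replicated on the C++ side. A minimal sketch, assuming a loaded torch::jit module and the export-time input shape of 1x3x640x640:

#include <torch/script.h>

// Run a few dummy forward passes so the JIT finishes its optimization
// before any real timing starts. The shape assumes the export settings
// above (batch 1, 3 channels, 640x640).
void WarmUp(torch::jit::script::Module& module, torch::Device device, int iterations = 3) {
    torch::NoGradGuard no_grad;  // inference only, no autograd bookkeeping
    auto dummy = torch::zeros({1, 3, 640, 640}).to(device);
    for (int i = 0; i < iterations; ++i) {
        module.forward({dummy});
    }
}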

References

  1. https://github.com/ultralytics/yolov5
  2. Question about the code in non_max_suppression
  3. https://github.com/walktree/libtorch-yolov3
  4. https://pytorch.org/cppdocs/index.html
  5. https://github.com/pytorch/vision
  6. PyTorch.org - CUDA SEMANTICS
  7. PyTorch.org - add synchronization points
  8. PyTorch - why first inference is slower

libtorch-yolov5's People

Contributors

liej6799, yasenh


libtorch-yolov5's Issues

Does anyone run GPU inference successfully?

I could not run inference with GPU enabled. I followed the instructions to modify the export.py code and exported the TorchScript model with GPU support, but when inferring with LibTorch it cannot load the weights.

Does anyone know how to solve it?

My OS is Windows 10, and I am able to run the CPU TorchScript model.

Thanks in advance.

Performance difference between running model in python vs c++?

Hi @yasenh ,
The code is beautifully written in C++. I tried as well but could not run it end to end successfully, due to my lack of expertise with C++ and LibTorch. Could you also provide some statistics showing whether running the model in C++ improves performance compared to running the same model in Python on GPU?

My guess is that there should not be much difference, but if one exists, where does it come from? I know I am asking a lot, but if you could analyze it, it would be really useful.

Python and libtorch model prediction results are inconsistent

Hello, I have updated to YOLOv5 version 4.0. I found that the prediction results of the Python model are slightly different from the results predicted by the LibTorch model; with version 3.1 the results are the same. What is the reason? Can you help me, thank you!

Inference speed

After a simple modification to your code to read local files in a loop, I tested the yolov5s model and found that the inference speed is only two or three frames per second, and GPU utilization is very low. So I would like to ask: does this project not support loading the model once and then running prediction repeatedly? The modified part of the code is shown below. Looking forward to your reply, thanks!

// load input image
std::vector<cv::String> filenames;
cv::String folder = "/home/xavier/dataset/DF";
cv::glob(folder, filenames);
for (size_t i = 0; i < filenames.size(); ++i)
{
    cv::Mat img = cv::imread(filenames[i]);
    //std::cout << "******" << filenames[i] << std::endl;
    if (img.empty())
    {
        std::cerr << "Error loading the image!\n";
        return -1;
    }

    // load network
    std::string weights = opt["weights"].as<std::string>();
    auto detector = Detector(weights, device_type);

    // set up threshold
    float conf_thres = opt["conf-thres"].as<float>();
    float iou_thres = opt["iou-thres"].as<float>();

    // inference
    auto result = detector.Run(img, conf_thres, iou_thres);

    // visualize detections
    if (opt["view-img"].as<bool>()) {
        Demo(img, result[0], class_names);
    }
    //cv::destroyAllWindows();
}
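
A likely cause is visible in the snippet above: the Detector, which loads the TorchScript weights, is constructed inside the file loop, so the model is re-loaded for every image. A minimal sketch of the fix, hoisting the one-time setup out of the loop (the names follow the snippet above):

// One-time setup: load the network and thresholds before the loop.
std::string weights = opt["weights"].as<std::string>();
auto detector = Detector(weights, device_type);
float conf_thres = opt["conf-thres"].as<float>();
float iou_thres = opt["iou-thres"].as<float>();

for (size_t i = 0; i < filenames.size(); ++i) {
    cv::Mat img = cv::imread(filenames[i]);
    if (img.empty()) continue;

    // Per-image work is now only inference and visualization.
    auto result = detector.Run(img, conf_thres, iou_thres);
    if (opt["view-img"].as<bool>()) {
        Demo(img, result[0], class_names);
    }
}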

Detecting an image with no recognizable content causes a crash

Unhandled exception at 0x00007FFB68A2A799 (in Demo.exe): Microsoft C++ exception: c10::Error at memory location 0x00000012BCF9BE60.

The break jumps into kernel_lambda.h:

auto operator()(Parameters... args) -> decltype(std::declval<FuncType>()(std::forward<Parameters>(args)...)) {
    return kernel_func_(std::forward<Parameters>(args)...);
}

Debugging shows the crash is first triggered here:
// get the max classes score at each result (e.g. elements 5-84)
std::tuple<torch::Tensor, torch::Tensor> max_classes = torch::max(det.slice(1, item_attr_size, item_attr_size + num_classes), 1);
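
A likely cause: when no candidate passes the confidence threshold, det is empty, and torch::max over the empty slice throws. A minimal guard, sketched with the names used above:

// Skip images with no remaining candidates before calling torch::max,
// which throws when the sliced tensor has zero rows.
if (det.size(0) == 0) {
    continue;  // nothing detected in this image
}
std::tuple<torch::Tensor, torch::Tensor> max_classes =
    torch::max(det.slice(1, item_attr_size, item_attr_size + num_classes), 1);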

./libtorch-yolov5 error

Hi,
Thank you for your work. When I run "./libtorch-yolov5 /data_1/train_project/OBJ_Detection/yolov5-forward/module/torchscript.pt /data_1/train_project/OBJ_Detection/yolov5-forward/img/000240_01046820200606110918_0035_670_3cls.jpg -gpu",
the following error occurs:

terminate called after throwing an instance of 'c10::Error'
what(): isTuple() INTERNAL ASSERT FAILED at /data_1/train_project/OBJ_Detection/yolov5-forward/libtorch/include/ATen/core/ivalue_inl.h:723, please report a bug to PyTorch. Expected Tuple but got GenericList (toTuple at /data_1/train_project/OBJ_Detection/yolov5-forward/libtorch/include/ATen/core/ivalue_inl.h:723)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6a (0x7f7d00dfaaaa in /data_1/train_project/OBJ_Detection/yolov5-forward/libtorch/lib/libc10.so)
frame #1: c10::IValue::toTuple() const & + 0x121 (0x559bed24f2b3 in ./libtorch-yolov5)
frame #2: + 0xef9c (0x559bed245f9c in ./libtorch-yolov5)
frame #3: + 0x4176b (0x559bed27876b in ./libtorch-yolov5)
frame #4: __libc_start_main + 0xe7 (0x7f7cabd8eb97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #5: + 0xcc6a (0x559bed243c6a in ./libtorch-yolov5)

How can I solve this problem?

Found a problem: the first two images are very slow in actual runs

I am not sure why. I feed images one at a time. With some models the first image takes 7.8 seconds and the second 1.2 seconds; with other models the first image takes a few hundred milliseconds while the second can take as long as 30 seconds. Strangely, after the first two images everything is normal, and the overall time is only a few tens of milliseconds.

Has anyone run into the same problem and found the cause? I have no idea where to start investigating the inference time!

batch_size

Hello, thanks for your code. Did you test batch inference?

I did all the steps but in the make step I get this error

[ 33%] Building CXX object CMakeFiles/libtorch-yolov5.dir/src/detector.cpp.o
In file included from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/ArrayRef.h:19:0,
from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/MemoryFormat.h:5,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/core/TensorBody.h:5,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/Tensor.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/Context.h:4,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/ATen.h:5,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/csrc/api/include/torch/types.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/script.h:3,
from /home/alikarimi/libtorch-yolov5/include/detector.h:5,
from /home/alikarimi/libtorch-yolov5/src/detector.cpp:1:
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/C++17.h:24:2: error: #error You need C++14 to compile PyTorch
#error You need C++14 to compile PyTorch
^~~~~
In file included from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/Exception.h:5:0,
from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Device.h:5,
from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Allocator.h:6,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/ATen.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/csrc/api/include/torch/types.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/script.h:3,
from /home/alikarimi/libtorch-yolov5/include/detector.h:5,
from /home/alikarimi/libtorch-yolov5/src/detector.cpp:1:
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected primary-expression before ‘auto’
inline decltype(auto) str(const Args&... args) {
^~~~
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected ‘)’ before ‘auto’
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected primary-expression before ‘auto’
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected primary-expression before ‘auto’
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected primary-expression before ‘auto’
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:17: error: expected primary-expression before ‘auto’
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/util/StringUtil.h:86:8: error: expected unqualified-id before ‘decltype’
inline decltype(auto) str(const Args&... args) {
^~~~~~~~
In file included from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Device.h:5:0,
from /home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Allocator.h:6,
from /home/alikarimi/libtorch-yolov5/libtorch/include/ATen/ATen.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/csrc/api/include/torch/types.h:3,
from /home/alikarimi/libtorch-yolov5/libtorch/include/torch/script.h:3,
from /home/alikarimi/libtorch-yolov5/include/detector.h:5,
from /home/alikarimi/libtorch-yolov5/src/detector.cpp:1:
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Device.h: In member function ‘void c10::Device::validate()’:
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Device.h:96:5: error: ‘str’ is not a member of ‘c10’
TORCH_CHECK(index_ == -1 || index_ >= 0,
^
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Device.h:98:5: error: ‘str’ is not a member of ‘c10’
TORCH_CHECK(!is_cpu() || index_ <= 0,
^
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Allocator.h: In member function ‘void* c10::Allocator::raw_allocate(size_t)’:
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Allocator.h:163:5: error: ‘str’ is not a member of ‘c10’
AT_ASSERT(dptr.get() == dptr.get_context());
^
/home/alikarimi/libtorch-yolov5/libtorch/include/c10/core/Allocator.h:163:5: error: ‘str’ is not a member of ‘c10’
AT_ASSERT(dptr.get() == dptr.get_context());
^
.....

Why is the detection speed slow on GPU?

Dear author, I use yolov5s.pt to detect images. It takes 289 ms per frame on CPU and 127 ms per frame on an RTX 3070 graphics card. Why is it so slow on GPU? The PyTorch (Python) version of yolov5 takes 11 ms per frame on the same RTX 3070.

I look forward to your help!

Export ONNX with CUDA

Hi!
I have modified the "export.py to support GPU,but still receive the following error:

RuntimeError: Input, output and indices must be on the current device

Do you have any suggestions on how this issue can be resolved?
Thanks!

Performance on Win10 with GPU

My device is an i5 CPU with a GTX 1080 Ti 11GB GPU, and I have successfully compiled and run on Windows 10 with GPU, but why does inference take that much time (Release mode)? I already commented out the warm-up part in main.cpp, and it still takes around 500 ms to process a single image. When I use the same model for detection in Python, it runs much more efficiently, at 20 FPS. I don't know whether something is wrong with my configuration or whether some other issue in the C++ project decreases the performance.

Question about model.model[-1].export = False

I have a question: when exporting the model, model.model[-1].export = False is set, but the Detect layer still applies a convolution to each feature map. If the exported model did not include the Detect layer, wouldn't its output be inconsistent with the trained model?

There is no clip_coords step in your code

Hi, when I use your code, I found a problem. In the Python version of yolov5 there is a clip_coords function in /utils/general.py (line 240) which clips xyxy bounding boxes to the image shape (height, width). Sometimes my predicted box values fall outside the image bounds, so I added a clip-coords step to your detector.cpp. I wonder if I'm doing the right thing. Thank you for sharing.
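
For reference, a minimal sketch of such a clipping step on the detection tensor, assuming the first four columns are x1, y1, x2, y2 in pixels (mirroring clip_coords in the Python version):

#include <torch/torch.h>
#include <opencv2/core.hpp>

// Clamp xyxy boxes to the image bounds, as clip_coords does in utils/general.py.
// det is assumed to be an [n, 6] tensor whose first four columns are x1, y1, x2, y2.
void ClipCoords(torch::Tensor& det, const cv::Size& img_size) {
    det.select(1, 0).clamp_(0, img_size.width);   // x1
    det.select(1, 1).clamp_(0, img_size.height);  // y1
    det.select(1, 2).clamp_(0, img_size.width);   // x2
    det.select(1, 3).clamp_(0, img_size.height);  // y2
}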

Error in cmake building

Hi @yasenh
I installed all dependencies and did the setup as you described in the repo. But when I tried to build with cmake (cmake .. && make), I got this error:
[Screenshot from 2020-12-29 05-30-36]

Can you please tell me whats the problem ?
Thank you

After switching to video detection, the second frame's inference takes hundreds of times longer

I rewrote main.cpp for video detection; the main code is as follows:

VideoCapture capture;
std::cout << "finish load network and open the video" << std::endl;

capture.open("/home/****/libtorch-yolov5/test.mp4");
if (!capture.isOpened())
{
    std::cout << "can not open ...\n" << std::endl;
    return -1;
}
Mat frame;
namedWindow("output", WINDOW_AUTOSIZE);

// set up threshold
float conf_thres = 0.4;  // opt["conf-thres"].as<float>();
float iou_thres = 0.5;   // opt["iou-thres"].as<float>();

for (;;)
{
    capture >> frame;
    //Mat pic;
    if (frame.empty()) break;
    //imshow("output", frame);
    std::cout << "start forward" << std::endl;
    auto result = detector.Run(frame, conf_thres, iou_thres);
    Demo(frame, result, class_names);
    imshow("output", frame);
    if (waitKey(33) >= 0) break;
}

capture.release();
cv::destroyAllWindows();
//return 0;

When the program runs, the first frame after loading the model is displayed almost immediately with correct detection boxes, and that step is fast; but the second frame takes hundreds of times longer, in the inference stage. After that, times drop back down. It happens on every run and with different videos too. I logged the times:
----------New Frame----------
img size:1080x1920
pre-process takes : 4 ms
inference takes : 137 ms <-------------------------------------------------------137
post-process takes : 19 ms
start forward
----------New Frame----------
img size:1080x1920
pre-process takes : 5 ms
inference takes : 7869 ms <--------------------------------------------------------7869
post-process takes : 24 ms
start forward
----------New Frame----------
img size:1080x1920
pre-process takes : 3 ms
inference takes : 8 ms <------------------------------------------------------------8
post-process takes : 25 ms
start forward
----------New Frame----------
img size:1080x1920
pre-process takes : 4 ms
inference takes : 8 ms <-------------------------------------------------------------8
post-process takes : 23 ms

What could be causing this?
Note: I did not know what the warm-up was for, so I removed it.

Error when running `PostProcessing()`


  • libtorch 1.5.0 debug.
  • Visual studio 2019
  • Windows 10

In the first step, PostProcess() with temp_img works fine.
But in the second step, PostProcess() with my custom image raises the error.

When I set half_ to false, the error is gone. Why does this error occur, and how can I run this code with half_ enabled?
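
A common pattern when half_ is enabled is that the raw model output is FP16 while the CPU-side post-processing expects FP32. A minimal sketch of converting before post-processing (half_ and the tensor extraction are assumptions based on this issue and the code elsewhere in this repo):

// If the model ran in half precision, convert detections back to float32
// before any CPU-side post-processing, which may not support FP16.
torch::Tensor detections = output.toTuple()->elements()[0].toTensor();
if (detections.scalar_type() == torch::kHalf) {
    detections = detections.to(torch::kFloat32);
}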

run yolov5 v4.0 error

run: ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --gpu --view-img
error:
terminate called after throwing an instance of 'torch::jit::ErrorReport'
what():

aten::_convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups, bool benchmark, bool deterministic, bool cudnn_enabled) -> (Tensor):
Expected at most 12 arguments but found 13 positional arguments.

Why are 13 positional arguments found?

system is NVIDIA Jetson Xavier NX and docker
opencv 4.4.0
libtorch 1.6.0
cuda 10.2
yolov5 v4.0

Time of post-processing is way too long

In the Python code, the prediction time for one image with the yolov5x model is 40 ms, including both obtaining pred and NMS.
In the C++ code, pred takes about 20 ms, but NMS reaches 50 ms. In fact, the most time-consuming part of the NMS step is the GPU-to-CPU data transfer; the actual NMS computation is not that expensive, so there should be a way to optimize this. Any pointers would be appreciated.
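
One common mitigation, sketched under the assumption that the raw output is an [n, 85] tensor with the objectness score in column 4 (the yolov5 layout): filter candidates by confidence on the GPU first, so only a small tensor crosses to the CPU:

// Drop low-confidence candidates on the GPU, then move the much smaller
// surviving tensor to the CPU for the rest of post-processing.
torch::Tensor PreFilter(const torch::Tensor& pred, float conf_thres) {
    auto mask = pred.select(1, 4) > conf_thres;                     // boolean mask over rows
    auto kept = pred.index_select(0, torch::nonzero(mask).squeeze(1));
    return kept.to(torch::kCPU);                                    // single small transfer
}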

std::bad_alloc

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

How to debug?

Hi, thanks so much for creating this repo; it is really awesome.

My question is how to debug with libtorch? Now I face the problem of "segmentation fault(core dumped)" after running the warm-up.


I tried to debug with VSCode, but I could not go deep into libtorch library.

Is it compulsory to use debug version of libtorch?

auto detections = output.toTuple()->elements()[0].toTensor();

Execution reaches auto detections = output.toTuple()->elements()[0].toTensor(); and breaks with an error:

inline c10::intrusive_ptr<ivalue::Tuple> IValue::toTuple() const & {
    AT_ASSERT(isTuple(), "Expected Tuple but got ", tagKind());
    return toIntrusivePtr<ivalue::Tuple>();
}

No obj problem

Hello,
Let me post a question about your project. In detect.cpp:

// if none remain then process next image
if (det.size(1) == 0) {
    continue;
}

Shouldn't det.size(1) == 0 be det.size(0) == 0? Is that right?

No CUDA for inference

Hi, just wondering if it is possible to build without CUDA? I don't have an NVIDIA GPU, so I want to run inference on the CPU.

Modify LetterboxImage error

Hello, thank you very much for your open source project; it has helped me a lot. I have a question:
When the model input image size is 640×640, the accuracy of the prediction result changes and the inference time becomes longer, so I modified LetterboxImage (following the Python version) to use a model input size of 640×480. But then the following error is reported:

terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/models/yolo.py", line 45, in forward
_35 = (_4).forward(_34, )
_36 = (_2).forward((_3).forward(_35, ), _29, )
_37 = (_0).forward(_33, _35, (_1).forward(_36, ), )
~~~~~~~~~~~ <--- HERE
_38, _39, _40, _41, = _37
return (_41, [_38, _39, _40])
File "code/torch/models/yolo.py", line 75, in forward
_52 = torch.sub(_51, CONSTANTS.c3, alpha=1)
_53 = torch.to(CONSTANTS.c4, dtype=6, layout=0, device=torch.device("cpu"), pin_memory=None, non_blocking=False, copy=False, memory_format=None)
_54 = torch.mul(torch.add(_52, _53, alpha=1), torch.select(CONSTANTS.c5, 0, 0))
~~~~~~~~~ <--- HERE
_55 = torch.slice(y, 4, 0, 2, 1)
_56 = torch.expand(torch.view(_54, [3, 80, 80, 2]), [1, 3, 80, 80, 2], implicit=True)

Traceback of TorchScript, original code (most recent call last):
/home/****/PycharmProjects/paper_yolov5/models/yolo.py(57): forward
/home/****/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
/home/****/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
/home/****/PycharmProjects/paper_yolov5/models/yolo.py(137): forward_once
/home/****/PycharmProjects/paper_yolov5/models/yolo.py(121): forward
/home/****/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
/home/****/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
/home/****/anaconda3/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
/home/****/anaconda3/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
/home/****/PycharmProjects/paper_yolov5/models/export.py(57): <module>
RuntimeError: The size of tensor a (60) must match the size of tensor b (80) at non-singleton dimension 3

How to solve it, thank you!
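
The traced TorchScript module bakes in the tracing-time input shape (note the CONSTANTS and the hard-coded [3, 80, 80, 2] view in the serialized code above), so a 640×480 input cannot be fed to a model exported at 640×640 without re-exporting. A letterbox that always pads to the export size sidesteps this. A minimal sketch, not the repo's actual LetterboxImage:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Resize with preserved aspect ratio, then pad to the fixed export size.
// Returns {scale, pad_w, pad_h} so boxes can be mapped back afterwards.
std::vector<float> LetterboxToFixedSize(const cv::Mat& src, cv::Mat& dst,
                                        const cv::Size& out_size) {
    float scale = std::min(out_size.width  / static_cast<float>(src.cols),
                           out_size.height / static_cast<float>(src.rows));
    int new_w = static_cast<int>(src.cols * scale);
    int new_h = static_cast<int>(src.rows * scale);
    float pad_w = (out_size.width  - new_w) / 2.0f;
    float pad_h = (out_size.height - new_h) / 2.0f;

    cv::Mat resized;
    cv::resize(src, resized, cv::Size(new_w, new_h));
    int top    = static_cast<int>(std::round(pad_h - 0.1f));
    int bottom = static_cast<int>(std::round(pad_h + 0.1f));
    int left   = static_cast<int>(std::round(pad_w - 0.1f));
    int right  = static_cast<int>(std::round(pad_w + 0.1f));
    // Pad with the gray value used by yolov5's Python letterbox.
    cv::copyMakeBorder(resized, dst, top, bottom, left, right,
                       cv::BORDER_CONSTANT, cv::Scalar(114, 114, 114));
    return {scale, pad_w, pad_h};
}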

Exporting the model from .pt to TorchScript is unsuccessful

Hi
python models/export.py --weights yolov5s.pt --img 640 --batch 1

Fusing layers...
Model Summary: 120 layers, 7.06617e+06 parameters, 7.06617e+06 gradients
Traceback (most recent call last):
File "models/export.py", line 41, in
y = model(img) # dry run

.
.
.

type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'Detect' object has no attribute 'm'

Why is CPU inference faster than GPU? What could be the reason?

The model yolov5s.pt was exported with models/export.py (unmodified, from the latest yolov5).
CPU model
Export:

python models\export.py --device cpu

Run:

Run once on empty image
----------New Frame----------
pre-process takes : 60 ms
inference takes : 4630 ms
post-process takes : 69 ms
----------New Frame----------
pre-process takes : 77 ms
inference takes : 3762 ms
post-process takes : 155 ms

GPU model
Export:

python models\export.py --device 0

Run:

Run once on empty image
----------New Frame----------
pre-process takes : 40 ms
inference takes : 2766 ms
post-process takes : 1 ms
----------New Frame----------
pre-process takes : 32 ms
inference takes : 10285 ms
post-process takes : 11 ms

isTuple() INTERNAL ASSERT FAILED

When I run the code, I get this error. How can I solve it?

terminate called after throwing an instance of 'c10::Error'
what(): isTuple() INTERNAL ASSERT FAILED at "/dxd/libtorch-yolov5/libtorch/include/ATen/core/ivalue_inl.h":842, please report a bug to PyTorch. Expected Tuple but got GenericList
Exception raised from toTuple at /dxd/libtorch-yolov5/libtorch/include/ATen/core/ivalue_inl.h:842 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) + 0x69 (0x7f569de58eb9 in /dxd/libtorch-yolov5/libtorch/lib/libc10.so)
frame #1: c10::IValue::toTuple() const & + 0xe5 (0x4206cd in ./libtorch-yolov5)
frame #2: ./libtorch-yolov5() [0x41916a]
frame #3: ./libtorch-yolov5() [0x4316f6]
frame #4: __libc_start_main + 0xf0 (0x7f5654d44840 in /lib/x86_64-linux-gnu/libc.so.6)
frame #5: ./libtorch-yolov5() [0x4176b9]

Aborted (core dumped)

Post-processing takes too long

Model forwarding takes only ~5 ms to infer the input blob, but post-processing takes about 50 ms. The PyTorch (Python) implementation takes only 15 ms for inference and post-processing combined, yet here post-processing takes far longer. Is there any way to optimize post-processing for low latency?

How to use 224×224 images

I tried changing std::vector<float> pad_info = LetterboxImage(img_input, img_input, cv::Size(640, 640)); to std::vector<float> pad_info = LetterboxImage(img_input, img_input, cv::Size(224, 224)); and ran into problems.

Memory leak issues, the program will die

Hi
Memory leaks occur after 3 consecutive runs of ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --gpu --view-img

If the memory is not released, the program will die.

Thanks
