bonnet's People

Contributors

sjdrc, tano297, zaher88abd

bonnet's Issues

Error in roslaunch bonnet_run bonnet_run.launch

I am running the ROS node on my PC. I downloaded the persons_512 model and froze it using python3 cnn_freeze.py -p ../../../../Downloads/persons_512 -l ../../../../Downloads/persons_512_frozen/

Later I changed the config file to the new model path model_path: "/home/padmaja/Downloads/persons_512_frozen"

However, I get the following error:
Full model path: /home/padmaja/Downloads/persons_512_frozen//optimized_tRT.uff
Can't open one of the node names from the nodes.yaml file
yaml-cpp: error at line 0, column 0: bad conversion
Closing engine and exiting.
Unable to init. network.
Failed to initialize CNN
[ERROR] [1560242291.484392845]: SOMETHING WENT WRONG INITIALIZING CNN. EXITING

Thanks in advance!

Installation on Jetson TX2

Hi,

I would like to use Bonnet on a Jetson TX2 for inference.
I tried to install it with Docker, following the documented procedure, but it doesn't work.
Has anyone already succeeded in doing this?

Thank you,

How to make pre-trained models

Hello.

This is good work.
I ran the Bonnet Docker image and it works well.
I downloaded the pre-trained model (people) and tested it:
https://www.youtube.com/watch?v=rl0X4XKp4k0

I am a beginner with TensorFlow and CNNs.
If I want to create a pre-trained model of my own, what should I do?

How do I use an existing dataset and model to create the pre-trained model?

Ultimately I want to create pre-trained models for indoor environments (wall, door, ground) for robot navigation.

I appreciate your work and agree with what you wrote:

We strongly believe that our framework
allows the scientific robotics community to save time on
the CNN implementations, enabling researchers to spend
more time to focus on how such approaches can aid robot
perception, localization, mapping, path planning, obstacle
avoidance, manipulation, safe navigation, etc.

Thank you.

own data preparation

Hi, thank you for the great work! I've successfully tried your trained model in ROS, it looks great!

Just a few questions to start training my own model:

  1. Could you provide more instructions on how to train on my own dataset? My dataset is ready: it has images and labels (class numbers 0-11), so I don't need any remapping and only have to rearrange the files to match the format you described in Issue #14. However, if I'd like to use your augmentation pipeline, how should I do it? Which file should I run, with what command, etc.?

  2. Do you have a rough idea how long training may take with one 1080 Ti GPU and 10k images?

  3. I guess all the data loaders assume that each image and its label have the same file name? Otherwise, do I need a preprocessing step to save them under the same name?
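
For the naming question, a quick way to check that every image has a label with the same base name is sketched below. The `img`/`lbl` folder names and the helper `unpaired_files` are assumptions made for illustration, not something confirmed by the repository; adjust them to whatever layout Issue #14 describes.

```python
import tempfile
from pathlib import Path


def unpaired_files(img_dir, lbl_dir):
    """Return (images without a label, labels without an image), matched by file stem."""
    img_stems = {p.stem for p in Path(img_dir).iterdir() if p.is_file()}
    lbl_stems = {p.stem for p in Path(lbl_dir).iterdir() if p.is_file()}
    return sorted(img_stems - lbl_stems), sorted(lbl_stems - img_stems)


# Self-contained demo on a temporary, made-up dataset.
with tempfile.TemporaryDirectory() as root:
    img = Path(root, "img")
    lbl = Path(root, "lbl")
    img.mkdir()
    lbl.mkdir()
    (img / "0001.jpg").touch()
    (img / "0002.jpg").touch()
    (lbl / "0001.png").touch()
    print(unpaired_files(img, lbl))  # (['0002'], [])
```

Matching on the stem rather than the full name lets images and labels use different extensions.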

Thanks again for the help!

ros catkin build can't find TensorRT

Hi, I was trying to deploy Bonnet on an NVIDIA DRIVE PX2. I think TensorRT is already installed on the PX2, but when I ran catkin build, it complained that it could not find TensorFlow or TensorRT. Is there any specific setup I need to do to fix this problem?

tensorrt

make failed

I installed TensorFlow in an Anaconda environment, so I tried to build Bonnet in the same environment, but I get an error:
catkin build bonnet_standalone
ImportError: "from catkin_pkg.package import parse_package" failed: No module named catkin_pkg.package
Make sure that you have installed "catkin_pkg", it is up to date and on the PYTHONPATH.
CMake Error at /opt/ros/kinetic/share/catkin/cmake/safe_execute_process.cmake:11 (message):
execute_process(/home/zqk/anaconda2/envs/tensorflow/bin/python
"/opt/ros/kinetic/share/catkin/cmake/parse_package_xml.py"
"/opt/ros/kinetic/share/catkin/cmake/../package.xml"
"/home/zqk/detection_code/bonnet/deploy_cpp/build/catkin_tools_prebuild/catkin/catkin_generated/version/package.cmake")
returned error code 1
Call Stack (most recent call first):
/opt/ros/kinetic/share/catkin/cmake/catkin_package_xml.cmake:74 (safe_execute_process)
/opt/ros/kinetic/share/catkin/cmake/all.cmake:151 (_catkin_package_xml)
/opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:20 (include)
CMakeLists.txt:4 (find_package)

When I run python -V, it prints:
Python 2.7.14 :: Anaconda, Inc.
However, after I exit the environment and build Bonnet again, the error comes back and the build fails.
Can you please help me? Thanks.
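
The error says that the interpreter catkin invoked (the Anaconda one named in the `execute_process(...)` line) cannot import `catkin_pkg`. A minimal check, meant to be run with that same interpreter, is sketched here; `has_module` is a hypothetical helper, and the code avoids Python 3-only features so it also runs under Python 2.7.

```python
import importlib
import sys


def has_module(name):
    """True if `name` can be imported by the interpreter running this script."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


print("interpreter: {}".format(sys.executable))
print("catkin_pkg importable: {}".format(has_module("catkin_pkg.package")))
```

If this prints `False`, the usual options are to install `catkin_pkg` into that environment or to build with the system Python that ROS was installed against; which one suits this setup is not settled in the thread.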

No matching distribution found for uff==0.2.0

Could not find a version that satisfies the requirement uff==0.2.0 (from -r train_py/requirements.txt (line 4)) (from versions: )
No matching distribution found for uff==0.2.0 (from -r train_py/requirements.txt (line 4))

Is there something wrong with the requirements.txt file?

Keras implementation

Hi
This is not an issue. I am trying to implement your model in this link. Are there any comments or papers related to it that you can share with me?

Thanks.

ros not catkin_make

CMakeFiles/Makefile2:445: recipe for target 'bonnet/src/lib/CMakeFiles/bonnet_core.dir/all' failed
make[1]: *** [bonnet/src/lib/CMakeFiles/bonnet_core.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 19%] Generating C++ code from darknet_ros_msgs/CheckForObjectsGoal.msg
[ 19%] Generating Lisp code from darknet_ros_msgs/CheckForObjectsActionGoal.msg
[ 20%] Generating Lisp code from darknet_ros_msgs/CheckForObjectsFeedback.msg
[ 20%] Built target usb_cam
[ 21%] Generating Lisp code from darknet_ros_msgs/CheckForObjectsActionResult.msg
[ 21%] Generating Lisp code from darknet_ros_msgs/CheckForObjectsGoal.msg
[ 21%] Generating C++ code from darknet_ros_msgs/CheckForObjectsAction.msg
[ 23%] Generating Lisp code from darknet_ros_msgs/CheckForObjectsAction.msg
[ 23%] Built target darknet_ros_msgs_generate_messages_lisp
[ 23%] Built target darknet_ros_msgs_generate_messages_cpp
Makefile:138: recipe for target 'all' failed
make: *** [all] Error 2
Invoking "make -j4 -l4" failed

Multi-input multi-output model

This is not an issue but a question.
Is there a fast way to create a multi-input, multi-output model? From reading the code, it seems I would need to create a new abstract_net class. Please correct me if I am wrong.

F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms

sudo nvidia-docker run -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME/.Xauthority:/home/developer/.Xauthority -v /home/$USER/tests_bonnet:/home/developer/bonnet_wrkdir/tests --net=host --pid=host --ipc=host bonnet /bin/bash

$ sudo ./cnn_use.py -l ../tests/logs/ -p ../tests/pretrained/city_512 -i ../tests/images/0bd9-1520278753069_c.jpg

INTERFACE:
Image to infer: ['../tests/images/0bd9-1520278753069_c.jpg']
Label: None
Log dir: ../tests/logs/
model path ../tests/pretrained/city_512
model type iou
data yaml: None
net yaml: None
train yaml: None
Verbose?: False
Features?: False
Probabilities?: False

Opening default data file data.yaml from log folder
Opening default net file net.yaml from log folder
Opening default train file train.yaml from log folder
Model folder exists! Using model from ../tests/pretrained/city_512/iou
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Fetching dataset
DEVICE AVAIL: /device:CPU:0
DEVICE AVAIL: /device:GPU:0
Initializing network
Building graph
encoder
downsample1
W: [5, 5, 3, 13] Train: False
.
.
.


Total number of parameters in network: 1,871,287


Predicting mask
mask shape [1, 256, 512]
Restoring checkpoint
Looking for model in ../tests/pretrained/city_512/iou
Retrieving model from: ../tests/pretrained/city_512/iou/model-best-iou.ckpt
Successfully restored model weights! :D
Saving this graph in ../tests/logs/
2018-04-23 21:23:44.476628: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)

Steps to use ROS node with docker image?

Hi,

I am using the docker image. I don't quite understand how to get the ROS node running.

$ roslaunch bonnet_run bonnet_run.launch
...
[ INFO] [1525824328.942834758]: Successfully launched node.
Unable to create network. 
Invalid yaml file /home/developer/Desktop/frozen_trial//train.yaml
[ERROR] [1525824328.942988464]: SOMETHING WENT WRONG INITIALIZING CNN. EXITING
[bonnet_node-2] process has died [pid 28742, exit code 1, cmd /home/developer/catkin_ws/devel/lib/bonnet_run/bonnet_node __name:=bonnet_node __log:=/home/developer/.ros/log/ae1c784a-531c-11e8-9312-38d547df6dab/bonnet_node-2.log].
log file: /home/developer/.ros/log/ae1c784a-531c-11e8-9312-38d547df6dab/bonnet_node-2*.log

It makes sense there is no model at that path, but I'm not sure how to proceed. In particular, I passed a model in through one of the args when starting docker as mentioned in the readme, but I don't know how to access this directory within the docker. If I just want to test a pre-trained model, does it make sense to download within the docker and freeze it?

I greatly appreciate all the documentation you have so far!

./cnn_train.py -d cfg/cityscapes/data.yaml -n cfg/cityscapes/net_bonnet.yaml -t cfg/cityscapes/train_bonnet.yaml -l ../log/ -p ../pretrained

test16@test16:~/bonnet/train_py$ ./cnn_train.py -d cfg/cityscapes/data.yaml -n cfg/cityscapes/net_bonnet.yaml -t cfg/cityscapes/train_bonnet.yaml -l ../log/ -p ../pretrained

INTERFACE:
data yaml: cfg/cityscapes/data.yaml
net yaml: cfg/cityscapes/net_bonnet.yaml
train yaml: cfg/cityscapes/train_bonnet.yaml
log dir ../log/
model path ../pretrained
model type iou

Commit hash (training version): b'2ce83b0'

Opening desired data file cfg/cityscapes/data.yaml
Opening desired net file cfg/cityscapes/net_bonnet.yaml
Opening desired train file cfg/cityscapes/train_bonnet.yaml
model folder exists! Using model from ../pretrained/iou
Copying files to ../log/ for further reference.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused

(cnn_train.py:7017): Gdk-CRITICAL **: gdk_cursor_new_for_display: assertion 'GDK_IS_DISPLAY (display)' failed
Traceback (most recent call last):
File "./cnn_train.py", line 174, in
NET["name"] + '.py')
File "/usr/lib/python3.5/imp.py", line 172, in load_source
module = _load(spec)
File "", line 693, in _load
File "", line 673, in _load_unlocked
File "", line 665, in exec_module
File "", line 222, in _call_with_frames_removed
File "/home/test16/bonnet/train_py/arch/bonnet.py", line 28, in
from arch.abstract_net import AbstractNetwork
File "/home/test16/bonnet/train_py/arch/abstract_net.py", line 42, in
import dataset.aux_scripts.util as util
File "/home/test16/bonnet/train_py/dataset/aux_scripts/util.py", line 22, in
import matplotlib.pyplot as plt
File "/usr/local/lib/python3.5/dist-packages/matplotlib/pyplot.py", line 114, in
_backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
File "/usr/local/lib/python3.5/dist-packages/matplotlib/backends/init.py", line 32, in pylab_setup
globals(),locals(),[backend_name],0)
File "/usr/local/lib/python3.5/dist-packages/matplotlib/backends/backend_gtk3agg.py", line 11, in
from . import backend_gtk3
File "/usr/local/lib/python3.5/dist-packages/matplotlib/backends/backend_gtk3.py", line 58, in
cursors.MOVE : Gdk.Cursor.new(Gdk.CursorType.FLEUR),
TypeError: constructor returned NULL
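
The traceback ends inside matplotlib's GTK3 backend, and the lines before it show that no display could be reached. A common workaround, not confirmed in this thread, is to select the non-interactive `Agg` backend through the `MPLBACKEND` environment variable before matplotlib is first imported. A sketch, where `pick_backend` is a hypothetical helper:

```python
import os


def pick_backend(environ):
    """Return "Agg" when no X display is configured, else None (keep matplotlib's default)."""
    return None if environ.get("DISPLAY") else "Agg"


backend = pick_backend(os.environ)
if backend is not None:
    # This has to happen before the first `import matplotlib.pyplot`.
    os.environ.setdefault("MPLBACKEND", backend)
```

If `DISPLAY` is set but unreachable (for example in an SSH session without X forwarding), this check is not enough, and exporting `MPLBACKEND=Agg` unconditionally before starting `cnn_train.py` may be the simpler route.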

Value Error in loss_f()

This error occurs with some datasets but not all of them, and I don't know why:

`Weights for loss function (median frec/frec(c)):
[0.07206047 0.98466394 1.3721456 ]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1628, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 1867776 and 1800000 for 'train_model/model/loss_0/loss/Pow' (op: 'Pow') with input shapes: [1867776,3], [1800000,3].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./cnn_train.py", line 186, in
net.train()
File "/home/ubuntu/bonnet_run/bonnet/train_py/arch/abstract_net.py", line 1076, in train
self.TRAIN["loss"], self.TRAIN["w_decay"])
File "/home/ubuntu/bonnet_run/bonnet/train_py/arch/abstract_net.py", line 145, in loss_f
focal_softmax = tf.pow(1 - softmax_mat, gamma_tf) *
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 444, in pow
return gen_math_ops._pow(x, y, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 5293, in _pow
"Pow", x=x, y=y, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1792, in init
control_input_ops)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1631, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 1867776 and 1800000 for 'train_model/model/loss_0/loss/Pow' (op: 'Pow') with input shapes: [1867776,3], [1800000,3].`
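
The two sizes in the error look like flattened pixel counts (batch × height × width) for the two inputs of `Pow`. One reading that reproduces both numbers exactly, offered as a guess rather than a diagnosis, is that image sides which are not a multiple of some stride get rounded up on one path of the graph but not on the other. The stride of 32 and the batch and image size below are assumptions, chosen only because they match the numbers in the log.

```python
def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


STRIDE = 32                # assumption, not taken from the thread
BATCH, H, W = 4, 600, 750  # hypothetical batch size and image size

as_is = BATCH * H * W
padded = BATCH * round_up(H, STRIDE) * round_up(W, STRIDE)
print(as_is, padded)  # 1800000 1867776
```

If this is what happens, it would also explain why only some datasets fail: those whose image sides are already divisible by the stride produce matching shapes. Resizing or cropping to such sizes would be the thing to try first.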

ros catkin build Error in netTRT.cpp

Hello,

I tried to build the package in ROS Kinetic on a Jetson TX2 (JetPack 3.3) and received the following error:
[image: screenshot of the build error, not preserved]

Does anyone know why I receive this message?

Thanks

Jetson TX1

I was able to run inference using the Python files on a host PC and on the Jetson. However, I could not run the TensorRT-optimized cnn_use_pb_tRT on the Jetson because the TensorRT Python libraries aren't available for the Jetson's arm64 environment. Just to clarify, to actually compile deploy_cpp I need to:

  1. Build Bazel from source for arm64.
  2. Build the tensorflow_cc external install for arm64.
  3. Then use catkin to build the executable.

Is that right? Thank you.

Error opening data.yaml file

I'm trying to use the pretrained model to generate masks for a video. I followed the instructions on how to run it
python ./cnn_video.py small.mp4 -p cnn_video_pb.py

and it gives me this error:

`INTERFACE:
Video to infer:
Log dir: /tmp/net_predict_log
model path cnn_video_pb.py
model type iou
data yaml: None
net yaml: None
train yaml: None
Verbose?: False

Opening default data file data.yaml from log folder
Error opening data yaml file...
`

roslaunch bonnet_run bonnet_run.launch error?

I am using the Docker image. When I run roslaunch bonnet_run bonnet_run.launch, I get:

Unable to create network.
Invalid yaml file /shared/city_512//nodes.yaml
[ERROR] [1526639307.186656472]: SOMETHING WENT WRONG INITIALIZING CNN. EXITING
[bonnet_node-2] process has died [pid 27831, exit code 1, cmd /home/developer/catkin_ws/devel/lib/bonnet_run/bonnet_node __name:=bonnet_node __log:=/home/developer/.ros/log/32c8b5d4-5a86-11e8-b58d-507b9deaef1f/bonnet_node-2.log].
log file: /home/developer/.ros/log/32c8b5d4-5a86-11e8-b58d-507b9deaef1f/bonnet_node-2*.log

graph output of uff file for TensorRT

Hey Andres,

I believe I understand what's going on, but I'd like to be sure: does the UFF file used to deploy the model in TensorRT output the unnormalized logits layer of the network, as opposed to the mask?
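
If the output is indeed the unnormalized logits, the mask can be recovered with a per-pixel argmax over the class axis. Softmax is monotonic, so applying it first would not change which class wins. A plain-Python sketch on a made-up image of one row, two pixels, and three classes (the values and the helper `logits_to_mask` are illustrative only):

```python
def logits_to_mask(logits):
    """logits[row][col] is a list of per-class scores; returns the winning class index per pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in logits]


logits = [[[0.2, 3.1, -1.0], [5.0, 0.0, 4.9]]]  # 1 row, 2 pixels, 3 classes
print(logits_to_mask(logits))  # [[1, 0]]
```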

Question about deployment

Hi,
I have run Bonnet's training and inference scripts successfully on an x86 server and on a PC.
I also see that you have tested on a Jetson TX2.
I'd like to know the easiest way to deploy your scripts on ARM platforms (NVIDIA PX2 and other GPU devices).
Do you have more deployment details about nvidia-docker?
If I make some small changes to your code, how do I build a new Docker image?

ROS Unable to create network. Invalid yaml file ~/log/train.yaml

SOLVED: For file path it has to be /home/$USER and not ~
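
The `~` shorthand is expanded by the shell, not by the code that opens the file, so a path read verbatim from a YAML file keeps the literal tilde. A small Python sketch of the difference follows. The helper name `resolve_model_path` is hypothetical, and the node itself is C++, so it would need an equivalent step or simply an absolute path in `cnn_cfg.yaml`.

```python
import os


def resolve_model_path(raw):
    """Expand a leading ~ and return an absolute path."""
    return os.path.abspath(os.path.expanduser(raw))


# Without expansion, this looks for a directory literally named "~" under the current directory.
print(os.path.exists("~/log"))
# With expansion, the path points into the user's home directory.
print(resolve_model_path("~/log"))
```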

Hi,

I'm running roslaunch bonnet_run bonnet_run.launch in the provided docker.
I've downloaded the city_512 pretrained model and used cnn_freeze.py to save everything in ~/log
However I'm getting the error Unable to create network. Invalid yaml file ~/log/train.yaml
It seems like the line _cfg_train = YAML::LoadFile(path + "/train.yaml"); in bonnet_core, bonnet.cpp is failing for some reason.

My configs are:
cnn_cfg.yaml

model_path: "~/log"
device: "/gpu:0"
verbose: true
backend: "tf" # tf (tensorflow, cpu or gpu mode) or trt (TensorRT nvidia gpu)

topic_cfg.yaml

# inputs
image_topic: /camera/rgb/image_raw

# outputs
bgr_topic: /bonnet/bgr
mask_topic: /bonnet/mask
color_mask_topic: /bonnet/color_mask

ERROR:

started roslaunch server http://25.43.1.109:40363/

SUMMARY
========

PARAMETERS
 * /bonnet_node/backend: tf
 * /bonnet_node/bgr_topic: /bonnet/bgr
 * /bonnet_node/color_mask_topic: /bonnet/color_mask
 * /bonnet_node/device: /gpu:0
 * /bonnet_node/image_topic: /camera/rgb/image...
 * /bonnet_node/mask_topic: /bonnet/mask
 * /bonnet_node/model_path: ~/log
 * /bonnet_node/verbose: True
 * /rosdistro: kinetic
 * /rosversion: 1.12.13

NODES
  /
    bonnet_node (bonnet_run/bonnet_node)

ROS_MASTER_URI=http://25.43.1.106:11311

process[bonnet_node-1]: started with pid [16115]
[ INFO] [1536089505.974622159]: Successfully launched node.
Unable to create network. 
Invalid yaml file ~/log/train.yaml
[ERROR] [1536089505.974933092]: SOMETHING WENT WRONG INITIALIZING CNN. EXITING
[bonnet_node-1] process has died [pid 16115, exit code 1, cmd /home/developer/catkin/devel/lib/bonnet_run/bonnet_node __name:=bonnet_node __log:=/home/developer/.ros/log/63444740-b077-11e8-8d9b-00044b794498/bonnet_node-1.log].
log file: /home/developer/.ros/log/63444740-b077-11e8-8d9b-00044b794498/bonnet_node-1*.log

Thanks for your help.

error: no matching function for call to ‘nvuffparser::IUffParser

Hello,

I am attempting to use deploy_cpp with TensorRT backend.

I have installed TensorRT 4 successfully.

When I run catkin_make, it warns that I do not have tensorflow_cc but that TensorRT was found. I am under the impression that this should be enough?

This is what I am seeing:
/home/ubuntu/bonnet_ws/src/bonnet/deploy_cpp/src/lib/src/netTRT.cpp:113:56: error: no matching function for call to ‘nvuffparser::IUffParser::registerInput(const char*, nvinfer1::DimsCHW&)’
_parser->registerInput(_input_node.c_str(), inputDims);

tensorflow/core/common_runtime/direct_session.cc:167] Invalid argument: Could not parse entry in 'visible_device_list': '/gpu:0'. visible_device_list = /gpu:0

Hello,

I would like to thank you for this repository; it is incredible work from your university.

When I tried deploying the network, the initialization of the TensorFlow session failed with:
tensorflow/core/common_runtime/direct_session.cc:167] Invalid argument: Could not parse entry in 'visible_device_list': '/gpu:0'. visible_device_list = /gpu:0

However, I am able to successfully run the ./session executable in the standalone directory, which gives the following output:

2018-04-22 16:18:29.689842: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-22 16:18:29.690138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 10.91GiB freeMemory: 10.15GiB
2018-04-22 16:18:29.690150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-22 16:18:29.852146: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9826 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Session successfully created.

I solved this issue by commenting out the "tf::graph::SetDefaultDevice(_dev, &_graph_def);" line (line 94 in netTF.cpp) and rebuilding.

Now the network deploys in ROS, thankfully.

I am leaving this here in case someone else runs into this issue.

I am using tensorflow 1.4.0, Ubuntu 16.04, and the most recent bazel.
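
The error message suggests that a device name (`/gpu:0`) ended up where TensorFlow expects `visible_device_list`, which takes comma-separated GPU indices such as `0` or `0,1`. A sketch of converting one form into the other is shown below. `device_to_index` is a hypothetical helper, and whether Bonnet should do this conversion internally is not something this thread settles.

```python
import re


def device_to_index(device):
    """Extract the GPU index from names like "/gpu:0" or "/device:GPU:1"."""
    match = re.search(r"gpu:(\d+)$", device, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("not a GPU device name: {!r}".format(device))
    return match.group(1)


print(device_to_index("/gpu:0"))         # 0
print(device_to_index("/device:GPU:1"))  # 1
```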

Cheers,
Jad

Inception model

Hi,
I'm doing my Master's thesis with Bonnet. I just want to ask what the difference is between your "inception-like model" and the original Inception model. What did you do to speed up the semantic segmentation? Thanks a lot.

Best regards
Thanh

Using /gpu:1

Hi,
I'm currently building a Bonnet application on the PX2 without ROS. I was able to install Bonnet, but when I run it there is a problem with the GPU device.
The PX2 has two GPUs, an iGPU and a dGPU. The iGPU is the only one that supports fp16, and it is located at /gpu:1.
Is there any way to make Bonnet use /gpu:1? At the moment it only accepts /gpu:0.
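
One generic option, independent of Bonnet, is the CUDA runtime's `CUDA_VISIBLE_DEVICES` environment variable: a process started with `CUDA_VISIBLE_DEVICES=1` sees only that GPU, and sees it as device 0. Whether the PX2 enumerates the iGPU as index 1 at the CUDA level is an assumption taken from the question. The helper `env_for_gpu` and the binary name in the comment are placeholders.

```python
import os


def env_for_gpu(index, base=None):
    """Copy of the environment in which only GPU `index` is visible to CUDA."""
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env


env = env_for_gpu(1)
print(env["CUDA_VISIBLE_DEVICES"])  # 1
# Hypothetical launch; the binary name is a placeholder:
# subprocess.run(["./my_bonnet_app"], env=env, check=True)
```

With this approach the application can keep asking for `/gpu:0`, because the remapping happens below it.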

Best regards
Thanh

MaxWorkSpace

Hey Andres,

In the NetTRT.cpp code, the line:

_builder->setMaxWorkspaceSize(1 << size);

I assume this sets the amount of GPU memory allocated for the network? You use a size of 32. What does this mean in this context?
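
As plain arithmetic, `1 << size` is 2 raised to the power `size`, so with `size = 32` the argument is 2^32 bytes, i.e. 4 GiB. The TensorRT builder documentation describes the workspace size as an upper bound on the temporary GPU memory the engine may use, not as the memory occupied by the network weights. A quick check of the number:

```python
size = 32
workspace_bytes = 1 << size
print(workspace_bytes)          # 4294967296
print(workspace_bytes / 2**30)  # 4.0 (GiB)
```

In C++ the same expression needs a type wider than a 32-bit `int` for the literal being shifted; how the Bonnet source declares it was not checked here.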

If I wanted to deploy two models (two .uff files) on one GPU, how would you recommend I proceed?

Issue in cnn_use_pb_tensorRT.py

After running the cnn_freeze script and getting access to the /tmp/frozen_model folder, I wanted to run the cnn_use_pb_tensorRT.py script, but I got this error message:

[TensorRT] ERROR: UFFParser: Validator error: test_model/model/decoder/upsample/unpool3/inv-res-3/inverted_residual/conv/out/LeakyRelu: Unsupported operation _LeakyRelu
[TensorRT] ERROR: Failed to parse UFF model stream
File "/usr/lib/python2.7/dist-packages/tensorrt/legacy/utils/init.py", line 255, in uff_to_trt_engine
assert(parser.parse_buffer(stream, 0, network, model_datatype))
Traceback (most recent call last):
File "/home/pedram/Desktop/bonnet-master/cnn_use_pb_tensorRT.py", line 251, in
DATA_TYPE) # .HALF for fp16 in jetson!
File "/usr/lib/python2.7/dist-packages/tensorrt/legacy/utils/init.py", line 263, in uff_to_trt_engine
raise AssertionError('UFF parsing failed on line {} in statement {}'.format(line, text))
AssertionError: UFF parsing failed on line 255 in statement assert(parser.parse_buffer(stream, 0, network, model_datatype))

Any ideas what is going wrong here?

P.S: I am using:
Ubuntu 16.04
GPU: Nvidia 1050ti
Nvidia driver version: 384.130
Cuda: 9.0
Cudnn: 7
Python: 2.7
TensorFlow version: 1.13.0rc
TensorRT version: 5.0.2.6
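
The parser is rejecting the `LeakyRelu` op. For a slope `alpha` between 0 and 1, leaky ReLU equals `max(x, alpha * x)`, so a frequently suggested workaround is to build the activation from a maximum and a multiplication before freezing the graph. Whether this TensorRT version accepts the resulting graph is not verified here. The identity itself can be checked in plain Python (`alpha = 0.1` is an arbitrary choice):

```python
def leaky_relu(x, alpha):
    """Reference definition: x for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x


def leaky_relu_as_max(x, alpha):
    """Same function written with max(); valid for 0 <= alpha <= 1."""
    return max(x, alpha * x)


alpha = 0.1
samples = [-3.0, -0.5, 0.0, 0.5, 3.0]
print(all(leaky_relu(x, alpha) == leaky_relu_as_max(x, alpha) for x in samples))  # True
```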

nodes.yaml is missing?

This project is very attractive! But when I tried it, the nodes.yaml file seemed to be missing.
Is it not included?

I'm using the pre-trained Cityscapes models (512x256 and 1024x512).

log:
Successfully created log directory: log
Unable to create network.
Invalid yaml file city_512/nodes.yaml

Invalid yaml file persons_512/nodes.yaml
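
The error messages collected on this page name several files that the loaders look for in the model directory: `data.yaml`, `net.yaml`, `train.yaml` and, for the deployment code, `nodes.yaml`. A quick pre-flight check is sketched below. The file list is assembled from those error messages only, so treat it as an assumption rather than the authoritative layout; `missing_model_files` is a hypothetical helper.

```python
import os

# Assumed from the error messages in these issues, not from the Bonnet sources.
EXPECTED = ("data.yaml", "net.yaml", "train.yaml", "nodes.yaml")


def missing_model_files(model_dir, expected=EXPECTED):
    """Return the expected files that are absent from model_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(model_dir, name))]


print(missing_model_files("city_512"))
```

An empty list means all four files are present; anything else names what the loader would fail on.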

I have encountered the same problem:
log:
Successfully created log directory: log
Unable to create network.
Invalid yaml file persons_512/nodes.yaml

I tried "./cnn_freeze.py -p /home/chen/log" to freeze the log folder, but another problem occurred:
Traceback (most recent call last):
File "./cnn_freeze.py", line 32, in
import tensorflow as tf
ImportError: No module named 'tensorflow'

I do not fully understand your answer and do not know how to solve this. Could you answer in more detail? Thank you very much!

Question about pretrianed model

Hi,
1. I'd like to know the training details of your pretrained model on the Cityscapes dataset.
Are all the hyper-parameters and training tricks (e.g. multi-scale training, data augmentation, ...) contained in data.yaml, train.yaml, and net.yaml?
Can you show me the training loss graph?

2. As for other similar semantic labeling datasets (e.g. rainy, night, or foggy scenes):
if they follow the Cityscapes labeling format, can I extend the training set and continue training from your pre-trained model?

ImportError: No module named arch.abstract_net

I executed "python cnn_freeze.py -p /home/kumar/Experiments/city_512/" but ended up with the error below:

File "/home/kumar/Experiments/bonnet-master/train_py/arch/bonnet.py", line 31, in
from arch.abstract_net import AbstractNetwork
ImportError: No module named arch.abstract_net
Kindly help me solve the problem.
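
The message form `No module named arch.abstract_net`, without quotes around the module name, is how Python 2 reports a failed import, while the training logs elsewhere on this page show Python 3.5. One possible cause, not confirmed in this thread, is therefore that `python` resolved to a Python 2 interpreter. A guard of the following kind makes that visible early; `require_python3` is a hypothetical helper.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise a readable error when run under Python 2."""
    if version_info[0] < 3:
        raise RuntimeError("run this script with python3, not python")
    return True


print(require_python3())
```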

Steps for running on Movidius Neural Stick

Thanks for making this package available. Great work!

I saw on the TODO an item for running on the Movidius Neural Stick. Is there a breakdown of the work needed or an idea of the approach? Seems like a good idea to allow for running on smaller embedded machines.

Matt

Error in cnn_train.py: pywrap_tensorflow.list_devices() could not find GPU

Hi~ I was trying to run cnn_train.py following the instructions provided in ReadMe and encountered the problem shown as follows

meng@meng:~/foo/bar/bonnet/train_py$ ./cnn_train.py -d cfg/persons/data.yaml -n cfg/persons/net_bonnet_inception.yaml  -t cfg/persons/train_bonnet_inception.yaml -l cfg/persons/logs/
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
----------
INTERFACE:
data yaml:  cfg/persons/data.yaml
net yaml:  cfg/persons/net_bonnet_inception.yaml
train yaml:  cfg/persons/train_bonnet_inception.yaml
log dir cfg/persons/logs/
model path None
model type iou
----------

Commit hash (training version):  b'2b24767'
----------

Opening desired data file cfg/persons/data.yaml
Opening desired net file cfg/persons/net_bonnet_inception.yaml
Opening desired train file cfg/persons/train_bonnet_inception.yaml
Copying files to cfg/persons/logs/ for further reference.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Training from scratch
Fetching dataset
Training with 1 GPU's
Training with batch size 36
DEVICE AVAIL:  /device:CPU:0
Number of GPU's available is 0
Traceback (most recent call last):
  File "./cnn_train.py", line 186, in <module>
    net.train()
  File "/home/meng/foo/bar/bonnet/train_py/arch/abstract_net.py", line 1019, in train
    assert(self.n_gpus == self.n_gpus_avail)
AssertionError
meng@meng:~/foo/bar/bonnet/train_py$ 

I tested the function pywrap_tensorflow.list_devices(), which device_lib.list_local_devices() uses for self.gpu_available(), in the console and found that it printed

[b'\n\r/device:CPU:0\x12\x03CPU \x80\x80\x80\x80\x01*\x001\xe1B:\\\\\x8bgf']

And after conversion, it became

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 8982061213019183087
]

It seems that it cannot find the GPU. How can I fix this? Thanks~

Import Error cv2 using docker image

Hi, your repo seems very useful and interesting, I thank you for that.

However, I got the following error trying to make it segment an image:

Traceback (most recent call last):
  File "./cnn_use.py", line 30, in <module>
    import cv2
ImportError: /opt/ros/kinetic/lib/python2.7/dist-packages/cv2.so: undefined symbol: PyCObject_Type

I use the Docker image as explained in the main README.md and tried the following command:
~/bonnet_wrkdir/train_py$ ./cnn_use.py -p city_512/acc/ -i ../test.jpg
I have the following folder structure, with city_512 taken from your download link:

- deploy_cpp/
- train_py/
    - city_512/
    - test.jpg

What can I do to make it work?
Thanks for your help!

validation matrix

Hi,
I'm wondering if there's any confusion matrix output from either training or validation to see the per class accuracy, IoU, etc?
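
Whatever the training scripts report, per-class metrics can be derived offline from a confusion matrix. A plain-Python sketch follows, assuming flattened integer label and prediction lists; the function names and the toy data are illustrative and not taken from Bonnet.

```python
def confusion_matrix(labels, preds, n_classes):
    """conf[t][p] counts pixels with true class t that were predicted as class p."""
    conf = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(labels, preds):
        conf[t][p] += 1
    return conf


def per_class_iou(conf):
    """IoU_c = TP / (TP + FP + FN); None when class c appears in neither labels nor predictions."""
    ious = []
    for c in range(len(conf)):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(row[c] for row in conf) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
conf = confusion_matrix(labels, preds, 3)
print(conf)                 # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_iou(conf))  # [0.3333333333333333, 0.6666666666666666, 0.5]
```

Per-class accuracy follows from the same matrix as `conf[c][c] / sum(conf[c])`.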

Running bonnet without GPU

In the README file, it says that the framework has been tested on a computer without a GPU, so I thought I'd give it a try. Unfortunately, the installation instructions using nvidia-docker only seem to work for computers with a GPU. I tried to install and run using vanilla docker, which seemed to work, but when I try to run, for example, ./cnn_use.py -l /tmp/path/to/log/ -p /tmp/path/to/pretrained -i /path/to/image, I get the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

Could anybody help me out with the installation for a computer without GPU? I am using a MacBook Pro with Ubuntu 18.04.

ROS catkin build

test16@test16:~/bonnet/deploy_cpp$ catkin build

Profile: default
Extending: [env] /opt/ros/kinetic
Workspace: /home/test16/bonnet/deploy_cpp

Source Space: [exists] /home/test16/bonnet/deploy_cpp/src
Log Space: [exists] /home/test16/bonnet/deploy_cpp/logs
Build Space: [exists] /home/test16/bonnet/deploy_cpp/build
Devel Space: [exists] /home/test16/bonnet/deploy_cpp/devel
Install Space: [unused] /home/test16/bonnet/deploy_cpp/install
DESTDIR: [unused] None

Devel Space Layout: linked
Install Space Layout: None

Additional CMake Args: None
Additional Make Args: None
Additional catkin Make Args: None
Internal Make Job Server: True
Cache Job Environments: False

Whitelisted Packages: None
Blacklisted Packages: None

Workspace configuration appears valid.

[build] Found '3' packages in 0.0 seconds.
[build] Updating package table.
Warning: generated devel space setup files have been deleted.
Starting >>> catkin_tools_prebuild
Finished <<< catkin_tools_prebuild [ 5.2 seconds ]
Starting >>> bonnet_core


Warnings << bonnet_core:cmake /home/test16/bonnet/deploy_cpp/logs/bonnet_core/build.cmake.001.log
Build type: Release
CMake Warning at /home/test16/bonnet/deploy_cpp/src/lib/CMakeLists.txt:43 (find_package):
By not providing "FindTensorflowCC.cmake" in CMAKE_MODULE_PATH this project
has asked CMake to find a package configuration file provided by
"TensorflowCC", but CMake did not find one.

Could not find a package configuration file provided by "TensorflowCC" with
any of the following names:

TensorflowCCConfig.cmake
tensorflowcc-config.cmake

Add the installation prefix of "TensorflowCC" to CMAKE_PREFIX_PATH or set
"TensorflowCC_DIR" to a directory containing one of the above files. If
"TensorflowCC" provides a separate development package or SDK, be sure it
has been installed.

Tensorflow_cc shared library NOT FOUND
TF_AVAIL OFF

CUDA Libs: /usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/x86_64-linux-gnu/librt.so
CUDA Headers: /usr/local/cuda/include
CUDA_AVAIL ON
TensorRT NOT Available
TRT_AVAIL OFF

OpenCV Libs: opencv_calib3d;opencv_core;opencv_dnn;opencv_features2d;opencv_flann;opencv_highgui;opencv_imgcodecs;opencv_imgproc;opencv_ml;opencv_objdetect;opencv_photo;opencv_shape;opencv_stitching;opencv_superres;opencv_video;opencv_videoio;opencv_videostab;opencv_viz;opencv_aruco;opencv_bgsegm;opencv_bioinspired;opencv_ccalib;opencv_cvv;opencv_datasets;opencv_dpm;opencv_face;opencv_fuzzy;opencv_hdf;opencv_img_hash;opencv_line_descriptor;opencv_optflow;opencv_phase_unwrapping;opencv_plot;opencv_reg;opencv_rgbd;opencv_saliency;opencv_stereo;opencv_structured_light;opencv_surface_matching;opencv_text;opencv_tracking;opencv_xfeatures2d;opencv_ximgproc;opencv_xobjdetect;opencv_xphoto
OpenCV Headers: /opt/ros/kinetic/include/opencv-3.3.1-dev;/opt/ros/kinetic/include/opencv-3.3.1-dev/opencv

YAML Libs: yaml-cpp
YAML Headers: /usr/lib/x86_64-linux-gnu/cmake/yaml-cpp/../../../../include
cd /home/test16/bonnet/deploy_cpp/build/bonnet_core; catkin build --get-env bonnet_core | catkin env -si /usr/bin/cmake /home/test16/bonnet/deploy_cpp/src/lib --no-warn-unused-cli -DCATKIN_DEVEL_PREFIX=/home/test16/bonnet/deploy_cpp/devel/.private/bonnet_core -DCMAKE_INSTALL_PREFIX=/home/test16/bonnet/deploy_cpp/install; cd -
......................................................................................................................................................................


Errors << bonnet_core:make /home/test16/bonnet/deploy_cpp/logs/bonnet_core/build.make.001.log
In file included from /home/test16/bonnet/deploy_cpp/src/lib/src/bonnet.cpp:19:0:
/home/test16/bonnet/deploy_cpp/src/lib/include/bonnet.hpp:33:2: error: #error ("At least TF OR TensorRT must be installed")
#error("At least TF OR TensorRT must be installed")
^
compilation terminated due to -Wfatal-errors.
make[2]: *** [CMakeFiles/bonnet_core.dir/src/bonnet.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/bonnet_core.dir/all] Error 2
make: *** [all] Error 2
cd /home/test16/bonnet/deploy_cpp/build/bonnet_core; catkin build --get-env bonnet_core | catkin env -si /usr/bin/make --jobserver-fds=3,4 -j; cd -
......................................................................................................................................................................
Failed << bonnet_core:make [ Exited with code 2 ]
Failed <<< bonnet_core [ 7.2 seconds ]
Abandoned <<< bonnet_run [ Unrelated job failed ]
Abandoned <<< bonnet_standalone [ Unrelated job failed ]
[build] Summary: 1 of 4 packages succeeded.
[build] Ignored: None.
[build] Warnings: 1 packages succeeded with warnings.
[build] Abandoned: 2 packages were abandoned.
[build] Failed: 1 packages failed.
[build] Runtime: 12.5 seconds total.
Exception ignored in: <bound method BaseEventLoop.__del__ of <_UnixSelectorEventLoop running=False closed=True debug=False>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/trollius/base_events.py", line 395, in __del__
File "/usr/local/lib/python3.5/dist-packages/trollius/unix_events.py", line 65, in close
File "/usr/local/lib/python3.5/dist-packages/trollius/unix_events.py", line 166, in remove_signal_handler
File "/usr/lib/python3.5/signal.py", line 47, in signal
TypeError: signal handler must be signal.SIG_IGN, signal.SIG_DFL, or a callable object

running docker image

Hi,

First, thanks for the software. Looks very cool.

I am using the docker image provided. But I have 2 questions about it.

After reading the Dockerfile for the image, my understanding is that all dependencies are built into the image, but Bonnet itself is not installed in it. Is that correct? I could not find the lines corresponding to the installation of the Bonnet code. If so, do I have to install it myself on top of the image?

I ran helloworld.py under the \bonnet-docker folder of the image and got a Segmentation fault (core dumped) error. Executing the code in helloworld.py line by line, I found that import tensorrt is what causes the error. Does it work fine on your machine?

Thanks for any help or suggestion.

Cheers,
Su

the output is not right

Hi,

I would like to ask whether you know why there are points in the prediction output:

test_ids2_frame0216

The points appear in rows and columns.

Thanks.

Retraining with Personal Data

I am trying to retrain with my own dataset. In dataset/aux_script, it says to "Use the output format extracted from the BAG that uses images and color labels created by Philipp's label creator."

Is this a tool I should use first for my annotation?

[build] Error: Unable to find source space `/bonnet/src`

I ran the following commands:

$ cd bonnet/deploy_cpp
$ catkin init
$ catkin build bonnet_standalone

When trying to build, I get the following:

Source Space: [missing] /bonnet/src
Log Space: [missing] /bonnet/logs
Build Space: [missing] /bonnet/build
Devel Space: [missing] /bonnet/devel
Install Space: [unused] /bonnet/install
DESTDIR: [unused] None
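
The output above reports /bonnet as the workspace root even though the commands were run from bonnet/deploy_cpp. catkin_tools locates a workspace by looking for a .catkin_tools marker directory in the current directory or one of its ancestors, so one possible cause is a marker left over in a parent directory. A minimal sketch for checking this, with hypothetical helper names that are not part of catkin or Bonnet:

```python
from pathlib import Path


def find_catkin_workspace(start):
    """Return the nearest directory at or above `start` that contains a
    `.catkin_tools` marker directory, or None if there is none.

    Hypothetical helper; it only imitates the upward search for the marker.
    """
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / ".catkin_tools").is_dir():
            return candidate
    return None


def has_source_space(workspace):
    """Return True if the workspace contains the default `src` source space."""
    return (Path(workspace) / "src").is_dir()


if __name__ == "__main__":
    root = find_catkin_workspace(Path.cwd())
    if root is None:
        print("No .catkin_tools marker found at or above the current directory.")
    else:
        print("Workspace root:", root)
        print("Has src directory:", has_source_space(root))
```

If the reported root is a parent of deploy_cpp and has no src directory, removing the stray marker there and running catkin init again inside deploy_cpp may resolve the error. This is a guess from the log, not a confirmed fix.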

input_norm_and_resized_node vs input_node

Hello,

I am curious what the difference is between input_node and input_norm_and_resized_node.

I see in deploy_cpp that netTF.cpp utilizes input_node, and netTRT.cpp utilizes input_norm_and_resized_node.

CI cornercases

A test case should be added to check the build when none or only one of the back-ends is installed.
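
As a rough sketch of what such a test matrix could cover, the snippet below enumerates the four back-end combinations and the build result one would expect for each. The expectation is taken from the "At least TF OR TensorRT must be installed" guard reported in bonnet.hpp, and the TF_AVAIL and TRT_AVAIL names from the CMake output in the build log. The functions themselves are hypothetical and do not correspond to existing CI code.

```python
from itertools import product


def expected_build_ok(tf_avail, trt_avail):
    """Mirror the compile-time guard: at least one back-end must be present."""
    return tf_avail or trt_avail


def backend_matrix():
    """List every TF/TensorRT availability combination with the expected outcome."""
    return [
        {
            "TF_AVAIL": tf,
            "TRT_AVAIL": trt,
            "expect_success": expected_build_ok(tf, trt),
        }
        for tf, trt in product([False, True], repeat=2)
    ]


if __name__ == "__main__":
    for case in backend_matrix():
        print(case)
```

Each entry could then drive one CI job that installs only the listed back-ends and checks that the build succeeds or fails as expected.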

Request for brief explanation of training from scratch for CityScapes

Dear Andres,
Can you provide a brief description of the steps for retraining the Bonnet network on Cityscapes from scratch? I downloaded the Cityscapes data from their website, but I am not sure which scripts I need to run to prepare the Cityscapes images and labels for input into cnn_train.py.

Thanks.
Chino

How do I visualize the results coming from the pretrained model?

I tried to run the following command.

train_py$ ./cnn_freeze.py -i ~/Downloads/leftImg8bit/train/aachen/aachen_000001_000019_leftImg8bit.png -l ~/Desktop/log -p ~/Downloads/city_1024/

It saved a bunch of files in my ~/Desktop/log. Now how do I visualize the output masks?
