openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™

Home Page: https://openvinotoolkit.github.io/training_extensions/

License: Apache License 2.0

openvino computer-vision deep-learning pytorch neural-networks-compression quantization hyper-parameter-optimization image-classification image-segmentation object-detection

training_extensions's Introduction

OpenVINO™ Training Extensions


Key Features • Installation • Documentation • License



Introduction

OpenVINO™ Training Extensions is a low-code transfer learning framework for computer vision. The framework's API and CLI commands let users train, infer, optimize, and deploy models quickly and easily, even with limited expertise in deep learning. OpenVINO™ Training Extensions offers diverse combinations of model architectures, learning methods, and task types based on PyTorch and the OpenVINO™ toolkit.

OpenVINO™ Training Extensions provides a "recipe" for every supported task type, which consolidates the information necessary to build a model. The model templates are validated on various datasets and serve as a one-stop shop for obtaining the best models in general. If you are an experienced user, you can configure your own model based on torchvision, mmcv, and the OpenVINO Model Zoo (OMZ).

Furthermore, OpenVINO™ Training Extensions provides automatic configuration for ease of use. The framework analyzes your dataset, identifies the most suitable model, and figures out the best input size and other hyper-parameters. The development team is continuously extending this auto-configuration functionality to make training as simple as possible, so that a single CLI command can produce accurate, efficient, and robust models ready to be integrated into your project.

Key Features

OpenVINO™ Training Extensions supports the following computer vision tasks:

  • Classification, including multi-class, multi-label and hierarchical image classification tasks.
  • Object detection including rotated bounding box support
  • Semantic segmentation
  • Instance segmentation including tiling algorithm support
  • Action recognition including action classification and detection
  • Anomaly recognition tasks including anomaly classification, detection and segmentation
  • Visual Prompting tasks including segment anything model, zero-shot visual prompting

OpenVINO™ Training Extensions supports the following learning methods:

  • Supervised and incremental training, including the class-incremental scenario.

OpenVINO™ Training Extensions provides the following usability features:

  • Auto-configuration. OpenVINO™ Training Extensions analyzes the provided dataset and selects the proper task and model to give the best accuracy/speed trade-off.
  • Datumaro data frontend: OpenVINO™ Training Extensions supports the most common academic dataset formats for each task. We are constantly working to extend the supported formats to give you more freedom in choosing a dataset format.
  • Distributed training to accelerate the training process when you have multiple GPUs.
  • Mixed-precision training to save GPU memory and use larger batch sizes.
  • Integrated, efficient hyper-parameter optimization (HPO) module. Through a dataset proxy and the built-in hyper-parameter optimizer, you can get much faster hyper-parameter optimization than with other off-the-shelf tools. The hyper-parameter optimization is dynamically scheduled based on your resource budget.

Installation

Please refer to the installation guide. If you want to make changes to the library, then a local installation is recommended.

Install from PyPI

Installing the library with pip is the easiest way to get started with otx.
pip install otx

This will install the OTX CLI. OTX requires torch and lightning by default for training. To use the full pipeline, run the commands below:

# Get help for the installation arguments
otx install -h

# Install the full package
otx install

# Install with verbose output
otx install -v

# Install with docs option only.
otx install --option docs
Install from source

To install from source, clone the repository and install the library using pip in editable mode.
# Use of a virtual environment is highly recommended
# Using conda
yes | conda create -n otx_env python=3.10
conda activate otx_env

# Or using your favorite virtual environment
# ...

# Clone the repository and install in editable mode
git clone https://github.com/openvinotoolkit/training_extensions.git
cd training_extensions
pip install -e .

This will install the OTX CLI. OTX requires torch and lightning by default for training. To use the full pipeline, run the commands below:

# Get help for the installation arguments
otx install -h

# Install the full package
otx install

# Install with verbose output
otx install -v

# Install with docs option only.
otx install --option docs

Quick-Start

OpenVINO™ Training Extensions supports both API- and CLI-based training. The API is more flexible and allows for more customization, while the CLI might be easier for those who would like to use OpenVINO™ Training Extensions off-the-shelf.

For the CLI, the commands below list the available subcommands and show how to get help for each:

# See available subcommands
otx --help

# Print help messages from the train subcommand
otx train --help

You can find details and examples in the CLI Guide and API Quick-Guide.

Below is how to train with auto-configuration, which requires only a dataset and a task type from the user:

Training via API
# Training with Auto-Configuration via Engine
from otx.engine import Engine

engine = Engine(data_root="data/wgisd", task="DETECTION")
engine.train()

For more examples, see the documentation: API Quick-Guide

Training via CLI
otx train --data_root data/wgisd --task DETECTION

For more examples, see the documentation: CLI Guide

In addition to the examples above, please refer to the documentation for tutorials on using custom models, overriding training parameters, per-task tutorials, and more.


Updates

v2.0.0 (1Q24)

TBD

Release History

Please refer to the CHANGELOG.md


Branches

  • develop
    • The main development branch, where new features for future releases are developed
  • misc
    • Previously developed models can be found on this branch

License

OpenVINO™ Toolkit is licensed under Apache License Version 2.0. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.


Issues / Discussions

Please use the Issues tab for bug reports, feature requests, or any questions.


Known limitations

The misc branch contains training, evaluation, and export scripts for models based on TensorFlow and PyTorch. These scripts are exploratory, have not been validated, and are not ready for production.


Disclaimer

Intel is committed to respecting human rights and avoiding complicity in human rights abuses. See Intel's Global Human Rights Principles. Intel's products and software are intended only to be used in applications that do not cause or contribute to a violation of an internationally recognized human right.


Contributing

For those who would like to contribute to the library, see CONTRIBUTING.md for details.

Thank you! We appreciate your support!



training_extensions's Issues

Error when converting ONNX to OpenVINO

I'm running the Global Context for Convolutional Pose Machines experiment. I trained according to the author's instructions, and converting to ONNX format also produced no errors. However, when converting from ONNX to OpenVINO format, the following error occurs:

Unexpected exception happened during extracting attributes for node 386.
Original exception message: Upsample mode bilinear for node 387 is not supported. Only nearest is supported.

%387 : Float(1, 290, 16, 16) = onnx::Upsample[mode="bilinear"](%385, %386), scope: SinglePersonPoseEstimationWithMobileNet/RefinementStage/Sequential[trunk]/UShapedContextBlock[0]

I think the error is in the following code:
d2 = self.decoder2(torch.cat([e1, F.interpolate(e2, size=(16, 16), mode='bilinear', align_corners=False)], 1))
d1 = self.decoder1(torch.cat([x, F.interpolate(d2, size=(32, 32), mode='bilinear', align_corners=False)], 1))

I changed the mode to nearest, but PyTorch then complained that only linear, bilinear, and trilinear are supported.
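An editorial note, hedged: that PyTorch message is the one F.interpolate raises when align_corners is passed with a non-interpolating mode, so dropping align_corners should make mode='nearest' work, and nearest is the only mode the converter supports here. A minimal sketch (the tensor shape follows the error message above):

import torch
import torch.nn.functional as F

# align_corners is only valid for linear/bilinear/trilinear modes, so it
# must be omitted when switching to 'nearest'.
e2 = torch.randn(1, 290, 8, 8)
up = F.interpolate(e2, size=(16, 16), mode='nearest')
print(up.shape)  # torch.Size([1, 290, 16, 16])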

ERROR:tensorflow:Model diverged with loss = NaN

Hi supporters,

I intend to train a face detection model using training_toolbox_tensorflow and the WIDER Face dataset converted to COCO format. My config.py is the same as vlp/config.py, except that I changed annotation_path to the WIDER Face path.
But when I run train.py, I am facing the problem "ERROR:tensorflow:Model diverged with loss = NaN".
I searched Stack Overflow and the TensorFlow issues and followed their instructions, but it still fails.

Did you meet the same problem before ?
Do you know how to overcome this issue ?

Thank you in advance !

Regards,
Michael

  • I have searched the issues of this repository and believe that this is not a duplicate.


Steps to Reproduce (for bugs)

  1. Convert WIDER Face to COCO format (I used the tools in this repo: https://github.com/the-house-of-black-and-white/morghulis)
  2. Add full image paths to the COCO annotations above
  3. Change annotation_path in vlp/config.py to the WIDER Face COCO annotations
  4. Start training: python3 tools/train.py vlp/config.py
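As an editorial aside, two common mitigations for a diverging loss are a smaller learning rate and gradient clipping. A hedged TF1-style sketch follows; the toy model exists only to make the snippet self-contained and is not the toolbox's actual training loop:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
pred = tf.layers.dense(x, 1)
loss = tf.reduce_mean(tf.square(pred - y))

# Lower learning rate plus per-gradient norm clipping.
optimizer = tf.train.MomentumOptimizer(learning_rate=1e-4, momentum=0.9)
grads_and_vars = optimizer.compute_gradients(loss)
clipped = [(tf.clip_by_norm(g, 5.0), v)
           for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)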


Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • TensorFlow installed from (source or binary): Conda
  • TensorFlow version (use command below): 1.12.0
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  • Python version: 3.6.8

  • CUDA/cuDNN version: 7.6.0 build cuda9.0_0

  • GPU model and memory: nvidia geforce gtx 1050 4G, NVIDIA-SMI 390.116


Pretrained model does not match for SSD

I've downloaded the model weights (MobileNet v2 0.35 256x256) from the README file of ssd_detector, but they don't work for inference or training.

  • I have searched the issues of this repository and believe that this is not a duplicate.

Current Behavior

Running train.py or infer.py produces this error:

    saver.restore(sess, checkpoint_filename_with_path)
  File "d:\ProgramData\Anaconda3\envs\tfg12\lib\site-packages\tensorflow\python\training\saver.py", line 1562, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key global_step not found in checkpoint
         [[node save/RestoreV2 (defined at infer_ssd.py:105)  = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
         [[{{node save/RestoreV2/_79}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_84_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'save/RestoreV2', defined at:
...........

key: Key global_step not found in checkpoint
I have no idea how to fix this.
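An editorial sketch of one common workaround, hedged because it bypasses rather than explains the checkpoint mismatch: build the Saver only from variables that actually exist in the checkpoint, and initialize the rest (here, global_step):

import tensorflow as tf

ckpt_path = "model.ckpt"  # placeholder path
ckpt_vars = {name for name, _ in tf.train.list_variables(ckpt_path)}
restorable = [v for v in tf.global_variables() if v.op.name in ckpt_vars]
saver = tf.train.Saver(var_list=restorable)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # covers the missing keys
    saver.restore(sess, ckpt_path)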


Steps to Reproduce (for bugs)

  1. Download the weights and unzip them into vlp
  2. Change the filename in the checkpoint file in the model directory
  3. Run python infer.py --video --input E:\a.mp4 vlp\config.py
  4. The error comes out


Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):windows 10
  • TensorFlow installed from (source or binary):binary
  • TensorFlow version (use command below):b'unknown' 1.12.0
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  • Python version:3.6.8

  • CUDA/cuDNN version: 10.0.130 / 7.3.1

  • GPU model and memory: 2080TI 11G


How to prepare my own data-set to train using training_toolbox_tensorflow

  • I have searched the issues of this repository and believe that this is not a duplicate.

Expected Behavior

I want to train on my own dataset using training_toolbox_tensorflow, but when preparing the data I don't know what tools to use to create annotation files in COCO format. I used this labeling tool (https://github.com/tzutalin/labelImg) to create annotation files for the TensorFlow Object Detection API, but the output format is not the JSON that the COCO dataset requires.

So could you please clarify how to prepare a dataset, step by step? For example, for the TF Object Detection API it is:
collect images -> label images -> convert XML to CSV -> create TFRecord -> start training -> export

Also, does training_toolbox_tensorflow support training on Windows?

Thanks!
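For reference, the COCO detection layout is a single JSON file with images, annotations, and categories arrays. A minimal hand-written sketch, with all values illustrative:

import json

coco = {
    "images": [
        {"id": 1, "file_name": "0001.jpg", "width": 1024, "height": 768},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100.0, 120.0, 64.0, 48.0],  # [x, y, width, height]
         "area": 3072.0, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "face"}],
}

with open("instances_train.json", "w") as f:
    json.dump(coco, f)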

Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 64bit
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.12
  • Python version: 3.6.8
  • CUDA/cuDNN version: 9.0
  • GPU model and memory: GTX 1080 8GB

Failed to convert pre-trained action recognition model to ONNX format

I'd like to convert the pre-trained action recognition model to OpenVINO IR files, so I followed the instructions at https://github.com/opencv/openvino_training_extensions/tree/develop/pytorch_toolkit/action_recognition. In the first step, converting the model to ONNX, I saw the error message below:

Traceback (most recent call last):
  File "main.py", line 385, in <module>
    main()
  File "main.py", line 330, in main
    export_onnx(args, model, args.onnx)
  File "main.py", line 36, in export_onnx
    model = model.module
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 539, in __getattr__
    type(self).__name__, name))
AttributeError: 'VideoTransformer' object has no attribute 'module'

Steps to Reproduce (for bugs)

  1. git clone https://github.com/opencv/openvino_training_extensions.git
  2. Download https://download.01.org/opencv/openvino_training_extensions/models/action_recognition/resnet_34_vtn_rgbd_kinetics.pth
  3. Under directory openvino_training_extensions/pytorch_toolkit/action_recognition, run python3 main.py --model resnet34_vtn_rgbdiff --clip-size 16 --st 2 --pretrain-path ~/resnet_34_vtn_rgbd_kinetics.pth --onnx resnet34_vtn_rgbd.onnx

I want to convert the pre-trained decoder and encoder model to OpenVINO IR files. Like https://github.com/opencv/open_model_zoo/blob/master/intel_models/action-recognition-0001-encoder/description/action-recognition-0001-encoder.md and https://github.com/opencv/open_model_zoo/blob/master/intel_models/action-recognition-0001-decoder/
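An editorial note, hedged: model.module only exists when the model has been wrapped in torch.nn.DataParallel, which typically happens only on CUDA machines; on CPU-only setups the unwrap in export_onnx() fails exactly like this. A minimal guard sketch:

import torch.nn as nn

net = nn.Linear(4, 2)            # stands in for VideoTransformer here
# net = nn.DataParallel(net)     # only wrapped when CUDA is used
net = net.module if isinstance(net, nn.DataParallel) else net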

Your Environment

  • Linux Ubuntu 16.04:
  • TensorFlow installed from binary
  • TensorFlow version (use command below): v1.13.1-0-g6612da8951
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  • Python version: Python 3.5.2
  • torch version: 1.1.0
  • CUDA/cuDNN version: not available
  • GPU model and memory: Intel Coffeelake GT3

tensorboard is incompatible.

Following the setup steps, TensorBoard is incompatible.

Current Behavior

When I create the virtual environment with the provided step (bash tools/init_venv.sh), TensorBoard is reported as incompatible, even though I thought I was using the right TensorBoard version.

Steps to Reproduce (for bugs)

  1. Run "~/training_toolbox_tensorflow$ bash tools/init_venv.sh"
  2. It reports: tensorflow-gpu 1.10.0 has requirement tensorboard<1.11.0,>=1.10.0, but you'll have tensorboard 1.12.2 which is incompatible.
  3. When I check the pip list:
tensorboard 1.10.0
tensorflow 1.10.0
tensorflow-gpu 1.10.0
tensorflow-tensorboard 1.5.1

Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):16.04
  • TensorFlow installed from (source or binary):1.10.0
  • TensorFlow version (use command below):1.10.0
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  • Python version:2.7/3.5

  • GPU model and memory:

Poor performance of LPRNet inference on unaligned plate images (plate images with margins)

I successfully trained an LPRNet plate recognition model on my own plate dataset by following this tutorial. The images in my dataset are tightly aligned crops with no margin, and inference with my trained model on plates without any margin is very good. But when I pass a plate with a margin to the model, I get a strange result containing 20 characters, whereas my plate has only 8. How can I resolve this? Do I need to train my model on plate images with margins? In that case I run into another problem: the margin size changes from one image to another, so the margins don't have a fixed size. How can I get good results for plate images with margins? Any recommendations would be appreciated.
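One hedged option this question suggests is training with randomized margins so the model becomes margin-invariant. An illustrative augmentation sketch; the function and parameter names are made up, not part of the toolbox:

import random
import cv2

def add_random_margin(plate, max_frac=0.3):
    h, w = plate.shape[:2]
    top = random.randint(0, int(h * max_frac))
    bottom = random.randint(0, int(h * max_frac))
    left = random.randint(0, int(w * max_frac))
    right = random.randint(0, int(w * max_frac))
    # Replicate border pixels to fake a loose crop, then resize back to
    # the 94x24 LPRNet input (cv2.resize takes (width, height)).
    padded = cv2.copyMakeBorder(plate, top, bottom, left, right,
                                cv2.BORDER_REPLICATE)
    return cv2.resize(padded, (94, 24))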

Dealing with a big dataset in LPRNet.

Hi.
I would like to train the LPRNet with a dataset of the size 1.3 million images of (94,24) plates. I have 3 questions:

  1. Is it reasonable to use a dataset with this huge number of plates? Is there a maximum dataset size for LPRNet?

  2. The default values of data augmentation in config file are:

apply_basic_aug = False
apply_stn_aug = True

Now, with 1.3 million images, do I need to set apply_basic_aug = True?

  3. How should I change the parameters steps = 250000 and learning_rate = 0.001 to fit the 1.3 million images? Do I need to increase the training steps to a higher value, like steps = 450000? In the LPRNet paper, the authors mention that they drop the learning rate by a factor of 10 every 100k iterations and train the network for 250k iterations in total, but I can't find a line for setting the decay factor of 10 in the config file. Could you please guide me on how to set the steps and learning_rate hyper-parameters for a 1.3-million-image dataset? The defaults are:

batch_size = 32
steps = 250000
learning_rate = 0.001
grad_noise_scale = 0.001
opt_type = 'Adam'
Thanks a lot.
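For the paper's schedule (divide the learning rate by 10 every 100k iterations), here is a hedged TF1 sketch of how it could be wired in where the optimizer is built; the toolbox's config file may not expose this directly:

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# 1e-3 until step 100k, 1e-4 until 200k, then 1e-5, as in the LPRNet paper.
learning_rate = tf.train.piecewise_constant(
    global_step, boundaries=[100000, 200000], values=[1e-3, 1e-4, 1e-5])
optimizer = tf.train.AdamOptimizer(learning_rate)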

How to export FP16 model by export.py

  • [checked ] I have searched the issues of this repository and believe that this is not a duplicate.

Expected Behavior

It should work on GPU or Movidius with an FP16 model.

Current Behavior

The result text displays a lot of zeros, e.g. “Y000000000000000000000000000000000000000000000000000000000000000000000000000000000000000”.

Steps to Reproduce (for bugs)

  1. Export the FP16 model
  2. Change lines 55 and 86 of export.py from FP32 to FP16
  3. Get the .bin and .xml files
  4. Put the files in the correct place

Context

I cannot get the right result.

Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):ubuntu 16.04
  • TensorFlow installed from (source or binary):binary
  • TensorFlow version (use command below):1.10.0
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  • Python version:3.5

  • CUDA/cuDNN version:9.0 7

  • GPU model and memory:


implementing transfer learning

Hi, thanks for sharing your code.
I would like to know how I can apply transfer learning to your model and use it to detect plates of another country and language.
Note that I don't want to train from scratch; I want to train only the last layers on my dataset, so I won't need a large dataset.
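In TF1-style code, the usual recipe is to restore all weights from the source checkpoint and then hand the optimizer only the variables of the layers you want to retrain. A self-contained hedged sketch; the scope names are illustrative, not the toolbox's real ones:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
labels = tf.placeholder(tf.int32, [None])
with tf.variable_scope("backbone"):
    h = tf.layers.dense(x, 8, activation=tf.nn.relu)
with tf.variable_scope("head"):
    logits = tf.layers.dense(h, 2)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))

# Only the head is trainable; the backbone weights stay frozen.
head_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="head")
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=head_vars)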

hang on training lprnet

https://github.com/opencv/training_toolbox_tensorflow/blob/42b9b47f889cf1fbb26dc8e2534ef42be85d59cf/training_toolbox/lpr/train.py#L92
It always hangs on this line after a few iterations, without any error, as if it is waiting for data.
The bigger the batch_size, the more iterations it runs before hanging.

INFO:tensorflow:Iteration: 4841, Train loss: 1.0368348  
INFO:tensorflow:Iteration: 4851, Train loss: 0.8683023  
INFO:tensorflow:Iteration: 4861, Train loss: 0.76075464  
INFO:tensorflow:Iteration: 4871, Train loss: 0.94428396  
INFO:tensorflow:Iteration: 4881, Train loss: 0.88550854  
INFO:tensorflow:Iteration: 4891, Train loss: 0.8643252  
INFO:tensorflow:Iteration: 4901, Train loss: 0.65195805  
INFO:tensorflow:Iteration: 4911, Train loss: 0.83273506  
INFO:tensorflow:Iteration: 4921, Train loss: 0.93929297  
INFO:tensorflow:Iteration: 4931, Train loss: 0.9036747  
INFO:tensorflow:Iteration: 4941, Train loss: 0.5995865  
INFO:tensorflow:Iteration: 4951, Train loss: 0.64076924  
INFO:tensorflow:Iteration: 4961, Train loss: 0.93964803  
INFO:tensorflow:Iteration: 4971, Train loss: 0.89070266  
The cursor keeps blinking and GPU utilization stays at 0%.

Unexpected Error When running instance_segmentation

Traceback (most recent call last):
  File "tools/train.py", line 215, in <module>
    main(args)
  File "tools/train.py", line 177, in main
    train_tool.model = locate(args.model)(train_dataset.classes_num)
  File "/usr/lib/python3.5/pydoc.py", line 1575, in locate
    nextmodule = safeimport('.'.join(parts[:n+1]), forceload)
  File "/usr/lib/python3.5/pydoc.py", line 350, in safeimport
    raise ErrorDuringImport(path, sys.exc_info())
pydoc.ErrorDuringImport: problem in segmentoly.rcnn.model_zoo.resnet_fpn_mask_rcnn - ImportError: No module named 'segmentoly.extensions._EXTRA'

cannot infer using .pb file

Hi @snosov1, I want to run inference using a .pb file instead of a checkpoint (as in infer.py).
My code is:

 height, width, channels_num = config.input_shape
  rnn_cells_num = config.rnn_cells_num

  # graph = tf.Graph()
  
  graph = load_graph('/data/training_toolbox_tensorflow/training_toolbox/lpr/chinese_lp/model/ie_model/graph.pb.frozen')
  with graph.as_default():
    with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=False):
      inp_data, filenames = data_input(height, width, channels_num, config.infer.file_list_path,
                                       batch_size=config.infer.batch_size)
      print("inp_data: ", inp_data)
      print("filenames: ", filenames)

      prob = inference(rnn_cells_num, inp_data, config.num_classes)
      prob = tf.transpose(prob, (1, 0, 2))  # prepare for CTC

      data_length = tf.fill([tf.shape(prob)[1]], tf.shape(prob)[0])  # input seq length, batch size

      result = tf.nn.ctc_greedy_decoder(prob, data_length, merge_repeated=True)

      predictions = tf.to_int32(result[0][0])
      d_predictions = tf.sparse_to_dense(predictions.indices,
                                         [tf.shape(inp_data, out_type=tf.int64)[0], config.max_lp_length],
                                         predictions.values, default_value=-1, name='d_predictions')

      # init = tf.initialize_all_variables()
      # saver = tf.train.Saver(write_version=tf.train.SaverDef.V2)

  # # session
  # conf = tf.ConfigProto()
  # if hasattr(config.eval.execution, 'per_process_gpu_memory_fraction'):
  #   conf.gpu_options.per_process_gpu_memory_fraction = config.train.execution.per_process_gpu_memory_fraction
  # if hasattr(config.eval.execution, 'allow_growth'):
  #   conf.gpu_options.allow_growth = config.train.execution.allow_growth

  # sess = tf.Session(graph=graph, config=conf)
  # coord = tf.train.Coordinator()
  # threads = tf.train.start_queue_runners(sess=sess, coord=coord)

  
  with tf.Session(graph=graph) as sess:
    vals, filenames = sess.run([d_predictions, filenames])
  print("vals: ", vals)

but I got this error:

Caused by op 'Conv_15/biases/read', defined at:
  File "infer.py", line 165, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "infer.py", line 160, in main
    infer(cfg)
  File "infer.py", line 79, in infer
    prob = inference(rnn_cells_num, inp_data, config.num_classes)
  File "/data/training_toolbox_tensorflow/training_toolbox/lpr/trainer.py", line 120, in inference
    activation_fn=None)  # fully convolutional linear activation

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Conv_15/biases
         [[node Conv_15/biases/read (defined at /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/framework/python/ops/variables.py:277)  = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Conv_15/biases)]]

How can I fix it? Thanks for your help.
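An editorial sketch of the likely fix, hedged: a frozen .pb already holds the trained weights as constants, so calling inference() again builds a second, uninitialized copy of the network on top of it, which is what the FailedPreconditionError reports. Fetching the frozen graph's own tensors avoids that; the 'd_predictions' name comes from the code above, while the input tensor name is an assumption to verify (e.g. with summarize_graph):

import numpy as np
import tensorflow as tf

graph_def = tf.GraphDef()
with open("graph.pb.frozen", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    inp = graph.get_tensor_by_name("input:0")        # assumed input name
    out = graph.get_tensor_by_name("d_predictions:0")
    images = np.zeros((1, 24, 94, 3), np.float32)    # placeholder batch
    with tf.Session(graph=graph) as sess:
        print(sess.run(out, feed_dict={inp: images}))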

Evaluation on windows machine is not working

A Faster RCNN ResNet-101 model was trained with the TensorFlow framework on a Linux machine and then converted into Intermediate Representation (.xml, .bin, and .mapping files). Evaluating test images on the same Linux machine works fine when specifying the CPU extension library libcpu_extension_sse4.so, but when evaluating the same test images on a Windows machine using MKLDNNPlugin.dll as the CPU extension, it gives the error attached below.

Command used to evaluate images on Windows machine:

python object_detection_inference_frcnn.py --model <\Path-to-IR >frozen_inference_graph.xml --input <\Path-to-test images>\test_images --cpu_extension C:\Intel\computer_vision_sdk_2018.4.420\deployment_tools\inference_engine\bin\intel64\Release\MKLDNNPlugin.dll

Expected Behavior

After providing CPU extension to MKLDNNPlugin.dll library for windows platform, it should evaluate images properly.

Current Behavior

After providing CPU extension to MKLDNNPlugin.dll library for windows platform, it is giving the following error:
C:\Users\an357110\Downloads\OpenVINO>python object_detection_inference_frcnn.py --model C:\Users\an357110\Downloads\OpenVINO\ir_model_22103_images_augmented_phase1\frozen_inference_graph.xml --input C:\Users\an357110\Downloads\OpenVINO\test_dataset\test_images --cpu_extension C:\Intel\computer_vision_sdk_2018.4.420\deployment_tools\inference_engine\bin\intel64\Release\MKLDNNPlugin.dll

Traceback (most recent call last):
File "object_detection_inference_frcnn.py", line 183, in
sys.exit(main() or 0)
File "object_detection_inference_frcnn.py", line 62, in main
plugin.add_cpu_extension(args.cpu_extension)
File "ie_api.pyx", line 360, in inference_engine.ie_api.IEPlugin.add_cpu_extension
File "ie_api.pyx", line 364, in inference_engine.ie_api.IEPlugin.add_cpu_extension
RuntimeError: GetProcAddress cannot locate method 'CreateExtension': 127


Steps to Reproduce

  1. Train a Faster RCNN Resnet 101 model using tensorflow framework on a Linux machine.
  2. Optimize the frozen model which will create Intermediate Representation
  3. Transfer the IR files to a Windows machine.
  4. set PATH variable to the required MKLDNNPlugin.dll library along with location of C++ header files.
  5. Evaluate images on the Windows machine.

Environment setup and system configuration:

Processor: Intel core i7-8700 CPU @3.2 GHz
Ram: 32 GB
System type: 64 bit Windows 10
Software: Anaconda distribution with required additional packages, OpenVINO toolkit version 4, CMake, Visual Studio.

Please help me in resolving this issue.

How to export ssd-mobilenet-v2 pytorch model to onnx/tensorflow/caffe format.

Hi Supporter,

I've trained a face detection model following these instructions: https://github.com/opencv/openvino_training_extensions/blob/develop/pytorch_toolkit/object_detection/face_detection.md

I dumped it to .pkl format and tried to export to ONNX format using PyTorch:

# dummy_input = "epoch_58.pth"
dummy_input = "result.pkl" 
# Obtain your model, it can be also constructed in your script explicitly
model = torchvision.models.mobilenet_v2(pretrained=True)
# Invoke export
torch.onnx.export(model, dummy_input, "test.onnx")

But I got the error: RuntimeError: Only tuples, lists and Variables supported as JIT inputs, but got str

Could you help me export the PyTorch model to ONNX, TensorFlow, or Caffe format?
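For reference, torch.onnx.export() takes the model plus an example input tensor, not a checkpoint path, which is exactly what triggers "Only tuples, lists and Variables supported as JIT inputs, but got str". A minimal corrected sketch; to export the trained detector itself you would first construct the repo's model class and load your state dict into it:

import torch
import torchvision

model = torchvision.models.mobilenet_v2(pretrained=True)
model.eval()
dummy_input = torch.randn(1, 3, 224, 224)  # example input, not a file name
torch.onnx.export(model, dummy_input, "test.onnx")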

The reasons I want to export to TensorFlow/Caffe format are:

  • The current OpenCV version doesn't support PyTorch models
  • Exporting to TensorFlow/Caffe lets me load the model with the cv::dnn::readNetFromTensorflow and cv::dnn::readNetFromCaffe APIs

Additionally, I exported this model to OpenVINO format (.xml, .bin), and it worked well on Movidius.

Thank you in advance !

Regards,
Michael

Why doesn't a Python script print to the console, and why can't I debug with pdb, on Ubuntu?

I am looking into the code.

For training lpr, we can use train.py in lpr folder.

train.py uses methods and classes in trainer.py, such as CTCUtils, InputData, inference and LPRVocab.

I put print inside LPRVocab to see how the code works as follows.

class LPRVocab:
  @staticmethod
  def create_vocab(train_list_path, val_list_path, use_h_concat=False, use_oi_concat=False):
    print('create_vocab called ')
    [vocab, r_vocab, num_classes] = LPRVocab._create_standard_vocabs(train_list_path, val_list_path)
    if use_h_concat:
      [vocab, r_vocab, num_classes] = LPRVocab._concat_all_hieroglyphs(vocab, r_vocab)
    if use_oi_concat:
      [vocab, r_vocab, num_classes] = LPRVocab._concat_oi(vocab, r_vocab)

    return vocab, r_vocab, num_classes

  @staticmethod
  def _char_range(char1, char2):
    """Generates the characters from `char1` to `char2`, inclusive."""
    for char_code in range(ord(char1), ord(char2) + 1):
      yield chr(char_code)

  # Function for reading special symbols
  @staticmethod
  def _read_specials(filepath):
    characters = set()
    with open(filepath, 'r') as file_:
      for line in file_:
        current_label = line.split(' ')[-1].strip()
        characters = characters.union(re.findall('(<[^>]*>|.)', current_label))
    return characters

  @staticmethod
  def _create_standard_vocabs(train_list_path, val_list_path):
    print('_create_standard_vocabs called ')
    chars = set().union(LPRVocab._char_range('A', 'Z')).union(LPRVocab._char_range('0', '9'))
    print(chars)
    print('for special characters')
    chars = chars.union(LPRVocab._read_specials(train_list_path)).union(LPRVocab._read_specials(val_list_path))
    print(chars)
    print('for list characters')
    chars = list(chars)
    print(chars)
    print('for sort characters')
    chars.sort()
    print(chars)
    print('for append characters')
    chars.append('_')
    print(chars)    
    num_classes = len(chars)
    print('num_classes '+str(num_classes))
    vocab = dict(zip(chars, range(num_classes)))
    print('vocab ')
    print(vocab)
    r_vocab = dict(zip(range(num_classes), chars))
    r_vocab[-1] = ''
    print('r_vocab ')
    print(r_vocab)
    return [vocab, r_vocab, num_classes]

But I don't see any prints in the console.

Then I used

python -m pdb train.py

and set breakpoints inside trainer.py. The breakpoints are never hit, and pressing 's' does not step into the other files either.

Why doesn't debugging work, and why doesn't print write to the console? I am using Python 3.5.

Link is not working!

The link described in the project is not working: https://download.01.org/openvinotoolkit/open_model_zoo/training_toolbox_pytorch/models/hpe/checkpoint_iter_370000.pth
It returns "The requested URL was not found on this server."


Unexpected Error log when running nosetests in https://github.com/opencv/training_toolbox_tensorflow

After setting up and running nosetests, the error log below prints out:
2018-12-03 07:29:03.768323: E tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUDA_ERROR_NO_DEVICE

It seems that the GPU is not probed successfully. I tried many ways and can't make the error log disappear.
BTW, I can run the COCO training on the GPU following the training_toolbox_tensorflow steps.

  • I have searched the issues of this repository and believe that this is not a duplicate.

Expected Behavior

No error log and Nvidia GPU can be probed automatically.

Current Behavior

Prints the error log and can't probe the NVIDIA GPU.

Steps to Reproduce (for bugs)

  1. Just follow the steps of https://github.com/opencv/training_toolbox_tensorflow. run the "nosetests"


Your Environment

Ubuntu 16.04 with Cuda 9.0. nvidia-smi can also work.

Super-resolution IR not working

Hello,

I'm trying to use the superresolution model. But when running the infer_ie.py file, the final image is a big black rectangle.

This warning also appears:

/usr/local/lib/python3.5/dist-packages/skimage/util/dtype.py:135: UserWarning: Possible precision loss when converting from float32 to uint8

Nothing in the file has been changed except adding a sorted() in load_ir_model(), because I checked that main() uses the inputs in order.

OS: Ubuntu 16.04.4
Python version: 3.5.2
scikit-image==0.15.0
opencv-openvino: 4.1.0-openvino

Add cpu/gpu key for init_venv.sh

Right now tools/init_venv.sh installs the default Python packages, and we have a workaround for the Travis CI environment. I suggest adding cpu and gpu keys to init_venv.sh.
For example: init_venv.sh [--tf_target=cpu|gpu]

Unable to optimize pretrained .pb model trained with tensorflow framework

I have trained a FasterRCNN Inception V2 model using the TensorFlow framework, but while optimizing it with the mo.py script, it asks me to specify the output port explicitly.

Steps to Reproduce (for bugs)

  1. Train a FasterRCNN object detection model using tensorflow framework.
  2. Execute model optimizer script by providing relevant arguments.
  3. Command used was:
    python mo.py --framework tf --input_model "E:\exp\model\frozen_inference_graph.pb" --output_dir "E:\exp\model" --data_type FP32 --log_level ERROR --mean_value [127.5,127.5,127.5] --scale 127.5 --input_shape [1,1,1,3]

I am working on a Windows machine with an Intel i7-8700 processor at 3.2 GHz. Please refer to the following snapshot for more details and suggest steps to successfully optimize the frozen model and convert it into the Intermediate Representation format (.xml & .bin files) of Intel's OpenVINO toolkit.


Thanks in advance.

While optimizing the frozen graph to IR using export.py in LPR, the error below occurs

[ ERROR ] Cannot infer shapes or values for node "CTCGreedyDecoder".
[ ERROR ] index 1 is out of bounds for axis 0 with size 1
[ ERROR ] It can happen due to bug in custom shape infer function <function CTCGreedyDecoderOp.ctc_greedy_decoder_infer at 0x7f1bed43ec80>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).


ImportError: libcublas.so.9.0

When trying to run nosetests in the venv as described, the following error occurs:
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

Running import tensorflow as tf as a separate Python command returns the same error. It seems this has something to do with the CPU/GPU version of TensorFlow.

Not able to get "output_node_names" from trained LPR model

I trained the LPR model with an NVIDIA GPU and generated a .pb file from the checkpoint and meta data.
When I ran summarize_graph on the .pb, I was not able to get the output node names.

To freeze the .pb I need output_node_names.
Please provide a solution for this.
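A hedged editorial sketch for locating the output: dump every node in the GraphDef and inspect the tail. Elsewhere in this toolbox the LPR decode output is named d_predictions, which is likely the value output_node_names wants:

import tensorflow as tf

graph_def = tf.GraphDef()
with open("lpr_graph.pb", "rb") as f:  # placeholder path
    graph_def.ParseFromString(f.read())
for node in graph_def.node:
    print(node.name, node.op)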

Action Recognition -- Separating the encoder and the decoder

The Inference Engine (IE) action recognition sample takes separated encoder and decoder pre-trained models as input arguments. On the other hand, when we export the Pytorch model using the instructions provided, we get a single encoder-decoder architecture. I have two main questions:

  1. Is there any way to export a Pytorch model into Openvino encoder model and Openvino decoder model separately so that they can be plugged into the action recognition IE sample?
  2. Is there any way to train just the decoder part of the architecture?
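On question 1, a hedged editorial sketch: if the checkpoint can be split into two nn.Module parts, each can be exported to ONNX on its own and then converted to IR separately, mirroring the OMZ encoder/decoder pair. The modules below are stand-ins, not the repo's real classes:

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def forward(self, clip):              # (N, 3, H, W) frames
        return clip.mean(dim=[2, 3])      # stand-in for per-frame embeddings

class Decoder(nn.Module):
    def forward(self, embeddings):        # (N, T, C) embedding sequence
        return embeddings.mean(dim=1)     # stand-in for temporal aggregation

torch.onnx.export(Encoder(), torch.randn(1, 3, 224, 224), "encoder.onnx")
torch.onnx.export(Decoder(), torch.randn(1, 16, 512), "decoder.onnx")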

Unexpected Error When running python3.6 train_single.py --dataset-folder <LIP_HOME> --checkpoint-path mobilenet_sgd_68.848.pth.tar --from-mobilenet

When I followed the guide, an error occurred in the second step: the code that loads the checkpoint looks up 'optimizer', 'scheduler', and other keys, but they are not present in the checkpoint. [screenshots omitted]

Don't know how to run inference

Hello @snosov1 @alexey-trushkov @alexey-sidnev AlexanderDokuchaev and everyone, thanks for your code. I have already trained and exported a frozen model (.pb) without optimization. I then use the frozen model with the TensorRT inference server, but I get wrong results compared to using infer.py (with checkpoints).

I don't know whether my exported .pb file is incorrect or my client code is wrong.

#### MY CLIENT CODE ####
import cv2
import numpy as np
from tensorrtserver.api import InferContext, ProtocolType


if __name__ == '__main__':
    input_name = 'input'
    output_name = 'd_predictions'
    model_name = 'lp-recognitor'
    protocol = ProtocolType.from_str('http')
    ctx = InferContext('localhost:8000', protocol, model_name, -1, False)

    img = cv2.imread("/data/tmp/plate/000111.png")
    # cv2.resize takes (width, height): the net expects a 24x94x3 input, so
    # resizing to (24, 94) and then reshaping to (24, 94, 3) scrambles pixels.
    img = cv2.resize(img, (94, 24)).astype(np.float32)

    input_data = [img]
    results = [ctx.run({input_name: input_data},
                       {output_name: InferContext.ResultFormat.RAW})]
    print("****************results*********************", results)

pretrained action recognition model can not get the score 93.44%,where am I wrong?

I want to test the score of the model with the command python3 main.py --dataset ucf101_1 --model se-resnext101-32x4d_vtn_rgbdiff -b128 --lr 1e-5 --seq 16 --st 2 --no-mean-norm --no-std-norm --no-train --no-val --test --pretrain-path ../light_model/se_resnext_101_32x4d_vtn_rgbd_ucf101_s1.pth, but I only get a score of 84%. How can I get the 93.44% that the models page reports?

Output format of pose estimation

I need to implement JSON output of the pose estimation for every image, but I can't understand all the network outputs.
So we have pose_entries and all_keypoints; what do they mean?
pose_entries is an array of 20 elements: 18 of them correspond to the points of a pose (-1 if there is no point; otherwise the number is the row index into all_keypoints). What does the 19th element mean? Confidence?
Each all_keypoints entry is a vector of 4 elements. What do the 3rd and 4th mean? As I understand it, the 3rd is confidence and the 4th is just the element's index. Is that correct?

LPR training train.py : "ModuleNotFoundError: No module named 'lpr'"

Expected Behavior

Training should start for the new data set configured.

Current Behavior

Traceback (most recent call last):
  File "train.py", line 9, in <module>
    from lpr.toolbox.utils import dataset_size
ModuleNotFoundError: No module named 'lpr'

Steps to Reproduce (for bugs)

Training the LPRNet model based on the README file:
  1. To start training, go to the training_toolbox/lpr directory and type in the command line:

python3 train.py chinese_lp/config.py

Context

It didn't allow me to run train.py.

Solution suggestion

Change the location of train.py, or change the from lpr.toolbox import statement.

Your Environment

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow installed from anaconda:
  • TensorFlow version 1.10

  • Python version:
    3.6.6


Add target directory for init_venv.sh

For testing purpose, it can be useful to have several different virtual environments. I suggest adding an extra key to set target virtual environment directory.
For example: init_venv.sh [target_dir]

How to Training SSD Object Detection coco Demo with GPU

Hi,
I want to train the SSD Object Detection coco demo with a GPU, but by default it works with the CPU only. I don't know what I am missing.

Environment:

  • OS : Ubuntu 18.04
  • Server: 4 * 1080Ti
  • docker - tensorflow/tensorflow:latest-gpu
  • tensorflow-gpu 1.12.0

Thanks

eval.py is printing "|||||||....."

After setting up, I downloaded training_toolbox_tensorflow-develop from GitHub and trained on the synthetic Chinese plates for 50000 steps. While running eval.py for validation, it checks the GT label, shows the test accuracy and time per step, and then prints "|||||....." without the validation process ever ending.

Expected Behavior

Expected a valid result

Current Behavior

The validation process does not finish; it keeps printing "|||||...." while running eval.py.

Your Environment

  • OS: Ubuntu 16.04
  • TensorFlow installed from binary
  • TensorFlow version: 1.12.0
  • Python version: 3.5.2
  • CUDA/cuDNN version: 7.5


Unexpected Error When running human_pose training

yh@yh-DJ-H310M-E:~/tutorials/openvino_training_extensions/pytorch_toolkit/human_pose_estimation$ python3 train.py --train-images-folder ../../data/coco/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder ../../data/coco/val2017/ --checkpoint-path models/mobilenet_sgd_68.848.pth.tar --from-mobilenet

[WARNING] Not found pre-trained parameters for model.0.1.num_batches_tracked
[WARNING] Not found pre-trained parameters for model.1.1.num_batches_tracked
(... many similar "[WARNING] Not found pre-trained parameters" lines for the remaining model.*, cpm.*, initial_stage.*, and refinement_stages.* parameters ...)
Traceback (most recent call last):
  File "train.py", line 170, in <module>
    args.checkpoint_after, args.val_after)
  File "train.py", line 80, in train
    for batch_data in train_loader:
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 469, in __init__
    w.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 267, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 67, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
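Editorial note, hedged: the warnings are expected with --from-mobilenet (only the backbone weights are loaded), and the actual failure is os.fork() running out of memory when the DataLoader spawns worker processes. Reducing num_workers, down to 0, which loads data in the main process, is the usual mitigation. A minimal sketch with a stand-in dataset:

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(8, 3, 368, 368))   # stand-in dataset
# num_workers=0 avoids os.fork() and its memory duplication entirely.
loader = DataLoader(dataset, batch_size=2, num_workers=0)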

AttributeError: 'JPEG' object has no attribute 'decompressor'

WARNING:tensorflow:Can't decode with jpeg4py (libjpeg-turbo): Could not load libjpeg-turbo library. Will use OpenCV.
Exception ignored in: <bound method JPEG.__del__ of <jpeg4py._py.JPEG object at 0x7f0e8962d710>>
Traceback (most recent call last):
  File "/home/lfy/anaconda3/lib/python3.6/site-packages/jpeg4py/_py.py", line 215, in __del__
    if self.decompressor is not None:
AttributeError: 'JPEG' object has no attribute 'decompressor'

How can I infer an image directly?

In the official code, if we want to infer images, we first have to generate infer.file_list_path (test_infer), and then the code reads the image paths from test_infer.

If I just want to infer one image, reading it directly, how should I modify the code?
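A hedged sketch of one way: TensorFlow's feed_dict can override any tensor, not just placeholders, so the queue-produced inp_data tensor in infer.py can be overridden with a single preprocessed image. Here sess, inp_data, and d_predictions are assumed to be the objects infer.py already builds:

import cv2
import numpy as np

img = cv2.imread("plate.png").astype(np.float32)
img = cv2.resize(img, (94, 24))  # cv2.resize takes (width, height)
# Overriding inp_data short-circuits the file-list input pipeline.
vals = sess.run(d_predictions, feed_dict={inp_data: img[None]})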

Train with crop image dataset

I trained the Global Context Module with MobileNet v1, MobileNet v2, and ShuffleNet v2 on a full-frame image dataset and on a cropped human image dataset (cropped from the full-frame images). With the full-frame dataset, MobileNet v2 > MobileNet v1 > ShuffleNet v2. With cropped images, MobileNet v2 > ShuffleNet v2, and both are better than with full-frame images, but MobileNet v1's accuracy decreases by 20% :(( and is lower than ShuffleNet v2. Can you help me understand why?
