yuliang-liu / box_discretization_network

Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

License: Other

Python 82.90% C++ 4.70% Cuda 11.58% C 0.74% Shell 0.08%

box_discretization_network's Issues

Train on HRSC2016

Thanks for sharing. I have one question: how do I train your model on HRSC2016, and what should I focus on?

Is it possible to build without CUDA?

How to create a json file for ICDAR13?

Please upload model files to Google Drive

I am impressed with your research, and really want to try it.
Please upload all model files to Google Drive. I am outside China and cannot connect to Baidu Disk.
Thank you.

Bug: item is boolean

in /workspace/sign/Box_Discretization_Network-master/maskrcnn_benchmark/structures/segmentation_mask.py:427

selected_polygons.append(self.polygons[i])

But i is always True

I guess this is a bug?
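
The symptom described above (the loop variable is always True) is what one would expect if the code iterates over the values of a boolean mask instead of over the positions where the mask is set. A minimal sketch of the difference, in plain Python: the lists below stand in for the polygon list and the boolean tensor, and `select_by_mask` is a hypothetical helper, not a function from the repository.

```python
def select_by_mask(polygons, mask):
    """Return the polygons whose mask entry is truthy.

    Hypothetical helper: `polygons` and `mask` are plain lists standing in
    for the polygon list and the boolean mask mentioned in the issue.
    """
    if len(polygons) != len(mask):
        raise ValueError("mask must have one entry per polygon")
    # Pair each polygon with its mask value instead of using the mask
    # values as indices: polygons[True] is simply polygons[1].
    return [poly for poly, keep in zip(polygons, mask) if keep]


polygons = ["poly_a", "poly_b", "poly_c"]
mask = [True, False, True]
selected = select_by_mask(polygons, mask)
print(selected)
```

Whether this is the actual cause at line 427 depends on the installed PyTorch version and on how the mask is built, so treat it as a starting point for debugging.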

About generate my own dataset with .json

Which file should I put in _indexes for label_file for the ICDAR 2015 dataset? I want to generate a new .json because I've made some modifications to the IC15 images to evaluate specific challenges in natural scenes.

How to generate keypoints in train.json?

I know what the first 9 triples mean, but what does the last triple mean?

Data augmentation issue

When I trained the model on the IC15 dataset, I got this error:

Original Traceback (most recent call last):
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/datasets/word_dataset.py", line 64, in __getitem__
img, anno = super(WordDataset, self).__getitem__(idx)
File "/home/.local/lib/python3.6/site-packages/torchvision/datasets/coco.py", line 118, in __getitem__
img, target = self.transforms(img, target)
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 24, in __call__
image, target = t(image, target)
File "/home/dulin/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 70, in __call__
target = [t.resize(image.size) for t in target]
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 70, in <listcomp>
target = [t.resize(image.size) for t in target]
AttributeError: 'dict' object has no attribute 'resize'

I traced the code and found that the target was the raw annotations, so I have no idea what to do. Has anyone else had this problem, and how can it be fixed? Thanks.
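
The traceback shows the torchvision base class calling `self.transforms(img, target)` while `target` is still the list of raw annotation dicts, before the repository's dataset has wrapped them in objects that have a `resize` method. A minimal sketch of that failure mode and one possible way around it follows. All class names here are stand-ins written for illustration (none of them come from torchvision or from this repository), and storing the transform under a private attribute is an assumed workaround, not a confirmed fix.

```python
class BaseDataset:
    """Stand-in for a base class that applies `self.transforms` itself."""

    def __init__(self, samples, transforms=None):
        self.samples = samples
        self.transforms = transforms

    def __getitem__(self, idx):
        img, anno = self.samples[idx]
        if self.transforms is not None:
            # The base class hands the raw annotation dicts to the transform.
            img, anno = self.transforms(img, anno)
        return img, anno


class WrappedTarget:
    """Stand-in for a target object that knows how to resize itself."""

    def __init__(self, anno):
        self.anno = anno

    def resize(self, size):
        return self


def resize_transform(img, target):
    # Mirrors the failing line: every element of target needs .resize().
    return img, [t.resize(img) for t in target]


class SafeDataset(BaseDataset):
    """Keeps the transform private so the base class never applies it."""

    def __init__(self, samples, transforms=None):
        super().__init__(samples, transforms=None)
        self._transforms = transforms

    def __getitem__(self, idx):
        img, anno = super().__getitem__(idx)
        target = [WrappedTarget(a) for a in anno]  # wrap before transforming
        if self._transforms is not None:
            img, target = self._transforms(img, target)
        return img, target


samples = [("image0", [{"bbox": [0, 0, 10, 10]}])]
img, target = SafeDataset(samples, resize_transform)[0]
print(type(target[0]).__name__)
```

Calling `BaseDataset(samples, resize_transform)[0]` instead raises the same AttributeError as in the report, because the dicts reach the transform unwrapped.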

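Hello! A question about the input data format

At test time, the path to an image folder is passed in. How can I pass in image array data directly instead?

Converting data to COCO format

How do I convert data in the ICDAR format to COCO format?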
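
As a rough illustration of the generic part of such a conversion, the sketch below turns one ICDAR 2015-style ground-truth line (eight comma-separated corner coordinates followed by the transcription) into a COCO-style annotation dict. The function name and the `category_id=1` default are assumptions made for this example. It does not produce the extra fields this repository trains on, so it is not a replacement for the conversion script provided by the author.

```python
import json


def icdar_line_to_coco(line, image_id, ann_id, category_id=1):
    """Convert one ICDAR 2015-style line to a COCO-style annotation dict.

    Hypothetical helper: expects "x1,y1,x2,y2,x3,y3,x4,y4,transcription".
    """
    # Some ground-truth files start with a byte-order mark; drop it.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = [int(v) for v in parts[:8]]
    xs, ys = coords[0::2], coords[1::2]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x_min, y_min, width, height],  # COCO uses [x, y, w, h]
        "segmentation": [coords],  # the quadrilateral as one polygon
        "area": width * height,  # bounding-box area, a simplification
        "iscrowd": 0,
    }


example = icdar_line_to_coco(
    "377,117,463,117,465,130,378,130,Genaxis Theatre", image_id=1, ann_id=1
)
print(json.dumps(example))
```

A full COCO file also needs the top-level `images` and `categories` lists alongside `annotations`.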

Dataset conversion

Thank you for sharing this great work. I have been trying the dataset conversion script you posted, but it is not working for IC15. I have one question: can I use the ABCNet dataset format conversion for this repo? Thank you in advance for your time and kind response.

ETA for online demo?

... Thanks!

Unable to train

2019-10-24 11:16:53,219 maskrcnn_benchmark.utils.checkpoint INFO: No checkpoint found. Initializing model from scratch
loading annotations into memory...
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
2019-10-24 11:16:53,293 maskrcnn_benchmark.trainer INFO: Start training
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
Done (t=0.06s)
creating index...
index created!
loading annotations into memory...
Done (t=0.07s)
creating index...
index created!

Hi, when training, the program exits after printing the log above, and training never starts.

Segmentation fault

Thank you very much for sharing this code. Can you help me with this error?
CUDA_VISIBLE_DEVICES=0 python demo/test_single_image.py --min-image-size 1000 --config-file /gpu/wangbeibei/code/Box/configs/r50_baseline.yaml --output_dir /gpu/wangbeibei/code/Box/results/ --img /gpu/wangbeibei/code/Box/testimages
image path /gpu/wangbeibei/code/Box/testimages/3.jpg
Segmentation fault

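The ke-head in the code does not match the paper

In the code:

In the paper:

Problem with the demo test?

Hello! I trained several models, but when I run test_single_image.py from the demo, every model gives exactly the same predictions. I even get the same results when the WEIGHT path is empty?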

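A few small bugs prevent the program from running

1. The script that generates the labels (kes) is wrong:
the fields that need to be added to anno are kes and match_type, not keypoints;
2. Loading the target in coco is wrong:
target.addfield should use textKES and MTY, not PersonKeypoints;

After these changes it runs.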
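
Before training, it can help to confirm that every annotation in the generated json actually carries the fields named in this report. The sketch below is a small checker written for illustration. The field names `kes` and `match_type` are taken from the report above and are assumed to be what the data loader expects, and `find_annotations_missing_fields` is a hypothetical helper.

```python
def find_annotations_missing_fields(coco, required=("kes", "match_type")):
    """List (annotation id, missing field names) for incomplete annotations.

    `coco` is a dict loaded from a COCO-style json file. The default field
    names are assumptions taken from the issue text.
    """
    missing = []
    for ann in coco.get("annotations", []):
        absent = [key for key in required if key not in ann]
        if absent:
            missing.append((ann.get("id"), absent))
    return missing


# Tiny in-memory example; with a real file, use json.load(open(path)).
coco = {
    "annotations": [
        {"id": 1, "bbox": [0, 0, 10, 10], "kes": [], "match_type": 1},
        {"id": 2, "bbox": [5, 5, 20, 20], "keypoints": []},
    ]
}
report = find_annotations_missing_fields(coco)
print(report)
```

An empty list means every annotation has both fields.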

How to run with torchvision=0.3?

Is it possible to predict multiple images at once?

Hello, I see that test_single_image.py tests a single image. Is it possible to predict multiple images at once?

Cannot find file

2019-10-20 14:20:06,875 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
File "tools/train_net.py", line 174, in <module>
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/engine/trainer.py", line 56, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
return self._process_next_batch(batch)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/word_dataset.py", line 64, in __getitem__
img, anno = super(WordDataset, self).__getitem__(idx)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torchvision/datasets/coco.py", line 117, in __getitem__
img = Image.open(os.path.join(self.root, path)).convert('RGB')
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/PIL/Image.py", line 2766, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/images/2019-07-02_103547.png'

But the image definitely exists.
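
The path in the error, 'datasets/images/2019-07-02_103547.png', is relative, so Python resolves it against the current working directory of the training process, which may not be the directory where the image was checked by hand. A quick way to see which absolute path is really being opened is sketched below. `resolve_image_path` is a hypothetical helper, and the root and file name are copied from the error message.

```python
import os


def resolve_image_path(root, file_name, cwd=None):
    """Return (absolute path, exists) for an image path as Python sees it.

    Hypothetical helper: `cwd` defaults to the current working directory,
    which is what a relative `root` is resolved against.
    """
    cwd = os.getcwd() if cwd is None else cwd
    # os.path.join ignores `cwd` when `root` is already absolute.
    full = os.path.normpath(os.path.join(cwd, root, file_name))
    return full, os.path.isfile(full)


path, exists = resolve_image_path("datasets/images", "2019-07-02_103547.png")
print(path, exists)
```

If this prints False when run from the directory where training is launched, starting the script from the repository root or using an absolute dataset path in the configuration are the usual things to try.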

icdar to coco

The following happens when I execute the script you gave to convert the data format

Could you tell me what this txt file refers to?

Detection only, with no recognition. So it is not end-to-end?

Question about running the to_eval file

How to use convert_to_BDN_sequence_free (1).py

I don't know how to use the convert_to_BDN_sequence_free (1).py file.
Do I need to run the file with the command below? If so, does data refer to the path of the dataset to be converted, and what does train mean?

SCENE TEXT vs HANDWRITTEN TEXT in demo

@Yuliang-Liu

In the online demo, what is the difference between the SCENE TEXT vs HANDWRITTEN TEXT?
Could you upload the .yaml configuration for both?

why not the smooth l1 loss or others in ke_head/loss.py?

I saw you use the cross entropy as the loss function, did you try different losses?

undefined symbol: __cudaRegisterFatBinaryEnd

python 3.6
This error occurs.

RuntimeError: copy_if failed to synchronize: device-side assert triggered

Hi author,
When I train on the COCO 2017 dataset, the following error occurs: copy_if failed to synchronize: device-side assert triggered. Could you please give me some advice?

/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 174, in <module>
    main()
  File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 167, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 73, in train
    arguments,
  File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
    loss_dict = model(images, targets)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 56, in forward
    x, result, detector_losses = self.roi_heads(features, proposals, targets)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 32, in forward
    x, detections, loss_box = self.box(features, proposals, targets)
  File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py", line 56, in forward
    [class_logits], [box_regression]
  File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py", line 151, in __call__
    sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
RuntimeError: copy_if failed to synchronize: device-side assert triggered

The config file is below:

OUTPUT_DIR: "./output/r50_baseline"
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "/home/kv/data/fairy/r50_imagenet_pretrained.pth"
  BACKBONE:
    CONV_BODY: "R-50-FPN"
  RESNETS:
    BACKBONE_OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    ASPECT_RATIOS: (0.25, 0.5, 1.0, 2.0, 4.0)
  ROI_HEADS:
    USE_FPN: True
    SCORE_THRESH: 0.05
    NMS: 0.5
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 2
  MASK_ON: True
  ROI_MASK_HEAD:
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
    PREDICTOR: "MaskRCNNC4Predictor"
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 2
    RESOLUTION: 28
    SHARE_BOX_FEATURE_EXTRACTOR: False
  # BDN KE
  KE_ON: True
  ROI_KE_HEAD:
    FEATURE_EXTRACTOR: "KERCNNFPNFeatureExtractor"
    POOLER_RESOLUTION: 14
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    PREDICTOR: "KERCNNC4Predictor"
    RESOLUTION: 56
    SHARE_BOX_FEATURE_EXTRACTOR: False
    NUM_KES: 8
    KE_WEIGHT: 0.1
    MTY_WEIGHT: 0.01
    POSTPROCESS_KES: True # Must be true 
    RESCORING: False
    RESCORING_GAMA: 1.4 # [0,2]
PROCESS:
  PNMS: False
  NMS_THRESH: 0.2
DATASETS:
  TRAIN: ("coco_2017_train", )
  TEST: ("coco_2017_val",)
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  BASE_LR: 0.0001
  BIAS_LR_FACTOR: 2
  WEIGHT_DECAY: 0.0001
  STEPS: (10000, 15000)
  MAX_ITER: 20000
  IMS_PER_BATCH: 4
  CHECKPOINT_PERIOD: 2500
INPUT:
  MIN_SIZE_TRAIN: (680,720,760,800,840,880,920,960,1000)
  MAX_SIZE_TRAIN: 1480
  MIN_SIZE_TEST:  1200
  MAX_SIZE_TEST: 1600
  CROP_PROB_TRAIN: 0.0 
  ROTATE_PROB_TRAIN: 0.0
  ROTATE_DEGREE: 15
TEST:
  IMS_PER_BATCH: 3

Could you please provide a pretrained model

Hi! Could you provide a pretrained model on ICDAR 2013 or 2015 or 2017? Thanks a lot!

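Multi-GPU training problem?

Hello, sorry to bother you again. When training with multiple GPUs I set os.environ["CUDA_VISIBLE_DEVICES"] = "1,2", but training actually runs only on GPU 1 and nothing runs on GPU 2. Are other settings needed?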
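
Setting CUDA_VISIBLE_DEVICES only controls which devices a process can see. It does not start one worker per GPU. The tracebacks elsewhere in these issues show tools/train_net.py receiving `args.local_rank` and `args.distributed`, which suggests it is meant to be started through PyTorch's distributed launcher. The sketch below only builds such a command. The `--config-file` flag and the config path are assumptions based on other commands quoted in these issues, and newer PyTorch releases recommend `torchrun` over `torch.distributed.launch`.

```python
import sys


def build_launch_command(num_gpus, config_file):
    """Build an argv list that starts one training process per GPU.

    Hypothetical helper: the script path and `--config-file` flag are
    assumptions, not verified against the repository.
    """
    return [
        sys.executable,
        "-m",
        "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "tools/train_net.py",
        "--config-file",
        config_file,
    ]


cmd = build_launch_command(2, "configs/r50_baseline.yaml")
print(" ".join(cmd))
```

To run it, pass `cmd` to `subprocess.run` with CUDA_VISIBLE_DEVICES set to "1,2" in the environment, and make sure SOLVER.IMS_PER_BATCH is divisible by the number of processes.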

AttributeError: 'dict' object has no attribute 'resize'

The target is a list and target[0] is a dict, so the error occurs, but I cannot find the source of this error or how to fix it.
The dataset is ic15.

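fatal error: cuda_runtime_api.h: No such file or directory

This happens when running python setup.py build develop. From searching online, the cause may be a mismatch between the torch and CUDA versions, but I have not found a way to fix it yet.

The full error: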
In file included from /home/rebot/anaconda3/envs/cwt_DeepLearning/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:0,
from /home/rebot/DeepLearning/Box_Discretization_Network-master/maskrcnn_benchmark/csrc/cpu/dcn_v2_cpu.cpp:4:
/home/rebot/anaconda3/envs/cwt_DeepLearning/lib/python3.7/site-packages/torch/include/c10/cuda/CUDAStream.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
#include <cuda_runtime_api.h>
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1

Has anyone found a solution to this problem?
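
The compiler message says that gcc could not find cuda_runtime_api.h, which ships with the CUDA toolkit, so one thing worth checking is whether the toolkit headers are installed and discoverable at all. The sketch below looks for the header under a few candidate toolkit roots. `find_cuda_header` is a hypothetical helper, and the candidate list (the CUDA_HOME and CUDA_PATH variables and /usr/local/cuda) is an assumption about common install locations.

```python
import os


def find_cuda_header(cuda_homes, header="cuda_runtime_api.h"):
    """Return the first existing <home>/include/<header>, or None."""
    for home in cuda_homes:
        if not home:
            continue  # skip unset environment variables
        candidate = os.path.join(home, "include", header)
        if os.path.isfile(candidate):
            return candidate
    return None


candidates = [
    os.environ.get("CUDA_HOME"),
    os.environ.get("CUDA_PATH"),
    "/usr/local/cuda",
]
print(find_cuda_header(candidates))
```

If this prints None, installing the full CUDA toolkit (not just the driver) and exporting CUDA_HOME before rebuilding is the usual first step. Whether the torch and CUDA versions also need to match is a separate check.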

Generating the json from my own data

Hello, what is the content of your class.txt?

Detectron2

@Yuliang-Liu
Detectron2 has been released, and it is faster than maskrcnn-benchmark in training.
Are you considering upgrading?

About the role of the text segmentation branch

Hello, I would like to know how the text segmentation branch fine-tunes the original prediction structure, and how it interacts with the ke head and bbox branches to improve accuracy.

Hello, I want to train on the CTW1500 dataset. Which parts do I need to modify?

About convert_to_BDN_sequence_free.py

What do the points in segs represent? For example, the coordinates in my txt files are ordered clockwise, starting from the top-left.

When running [bash my_test.sh], I have an OpenCV issue.

Which version of OpenCV do you run?

Thanks

index created!
2019-12-23 15:27:03,388 maskrcnn_benchmark.inference INFO: Start evaluation on ic15_test dataset(500 images).
  0%|                                                                                                                                          | 0/167 [00:00<?, ?it/s]THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 167/167 [01:03<00:00,  2.63it/s]
2019-12-23 15:28:06,865 maskrcnn_benchmark.inference INFO: Total run time: 0:01:03.476810 (0.12695361995697022 s / img per device, on 1 devices)
2019-12-23 15:28:06,866 maskrcnn_benchmark.inference INFO: Model inference time: 0:00:52.991849 (0.10598369789123535 s / img per device, on 1 devices)
2019-12-23 15:28:07,013 maskrcnn_benchmark.inference INFO: Preparing results for COCO format
2019-12-23 15:28:07,013 maskrcnn_benchmark.inference INFO: Preparing bbox results
2019-12-23 15:28:07,119 maskrcnn_benchmark.inference INFO: Preparing segm results
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "tools/test_net.py", line 99, in <module>
    main()
  File "tools/test_net.py", line 93, in main
    output_folder=output_folder,
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/engine/inference.py", line 129, in inference
    **extra_args)
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/__init__.py", line 27, in evaluate
    return word_evaluation(**args)
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/__init__.py", line 20, in word_evaluation
    expected_results_sigma_tol=expected_results_sigma_tol,
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 54, in do_coco_evaluation
    coco_results["segm"] = prepare_for_coco_segmentation(predictions, dataset)
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 176, in prepare_for_coco_segmentation
    rects = [mask_to_roRect(mask, [image_height, image_width]) for mask in masks]
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 176, in <listcomp>
    rects = [mask_to_roRect(mask, [image_height, image_width]) for mask in masks]
  File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 140, in mask_to_roRect
    _, countours, hier = cv2.findContours(e.clone().numpy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE) # Aarlog
ValueError: not enough values to unpack (expected 3, got 2)
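
The failing line unpacks three values from cv2.findContours. OpenCV 3.x returns (image, contours, hierarchy), while OpenCV 2.x and 4.x return (contours, hierarchy), which matches the "expected 3, got 2" message. A version-independent way to unpack the result is sketched below. `unpack_find_contours` is a hypothetical helper, and the tuples in the example are stand-ins so the snippet runs without OpenCV installed.

```python
def unpack_find_contours(result):
    """Return (contours, hierarchy) from a cv2.findContours result.

    Works for both the 2-tuple and the 3-tuple return shapes, because
    contours and hierarchy are always the last two elements.
    """
    if len(result) not in (2, 3):
        raise ValueError("unexpected findContours result length: %d" % len(result))
    return result[-2], result[-1]


# Stand-ins for the two return shapes.
three_values = ("image", ["contour"], "hierarchy")  # OpenCV 3.x style
two_values = (["contour"], "hierarchy")  # OpenCV 2.x / 4.x style
print(unpack_find_contours(three_values))
print(unpack_find_contours(two_values))
```

In word_eval.py this would mean wrapping the existing cv2.findContours call with the helper instead of unpacking three names directly. Pinning an OpenCV 3.x release is the other common workaround.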

Training fails with a json file generated from my own data

json_file of ReCTs-competition datasets

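Hello, I am trying to run this model on the dataset from your competition, but while converting the dataset to a format the model accepts, I found that the script you provide contains segs = [int(kkpart) for kkpart in parts[4:]]. Which coordinates in the label does this parts correspond to? The gt is already given by parts[0:3]. Thanks.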

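No method for processing the ke predictions at inference

How should this be used at inference? For datasets of irregular quadrilaterals, I would like to use the kes predictions directly, but no post-processing method is provided. Do I need to write it myself?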

KES NOT FOUND

Traceback (most recent call last):
File "tools/train_net.py", line 174, in <module>
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
loss_dict = model(images, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 56, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 71, in forward
x, detections, loss_ke, loss_mty = self.ke(ke_features, detections, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/ke_head.py", line 64, in forward
proposals = self.loss_evaluator.subsample(proposals, targets)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 107, in subsample
labels, kes, mty = self.prepare_targets(proposals, targets)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 68, in prepare_targets
proposals_per_image, targets_per_image
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 53, in match_targets_to_proposals
target = target.copy_with_fields(["labels", "kes", "mty"])
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/structures/bounding_box.py", line 251, in copy_with_fields
raise KeyError("Field '{}' not found in {}".format(field, self))
KeyError: "Field 'kes' not found in BoxList(num_boxes=5, image_width=1271, image_height=720, mode=xyxy)"

How can I solve this problem?

-rw-rw-r--  1 premy premy 181K Dec 23 15:36 output_dir0000000.jpg
-rw-rw-r--  1 premy premy 333K Dec 23 15:36 output_dir0000001.jpg