yuliang-liu / box_discretization_network
Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.
License: Other
Thanks for sharing. I have one question: how do I train your model on HRSC2016, and what should I focus on?
How to create a json file for ICDAR13?
I am impressed with your research, and really want to try it.
Please upload all model files to Google Drive. I am outside of China and cannot connect to Baidu Disk.
Thank you.
in /workspace/sign/Box_Discretization_Network-master/maskrcnn_benchmark/structures/segmentation_mask.py:427
selected_polygons.append(self.polygons[i])
But i is always True
I guess this is a bug?
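If `i` really is a boolean at that point, the following sketch shows why the result would be wrong. It does not use the repository's code: `select_by_mask` and the sample data are hypothetical and only illustrate that Python treats `True` and `False` as the integers 1 and 0 when they are used as list indices.

```python
def select_by_mask(items, mask):
    """Return the items whose mask entry is truthy (hypothetical helper)."""
    if len(items) != len(mask):
        raise ValueError("mask length must match the number of items")
    return [item for item, keep in zip(items, mask) if keep]


polygons = ["poly0", "poly1", "poly2", "poly3"]
mask = [True, False, True, True]

# Problematic pattern: each boolean is used as an index (True == 1, False == 0).
buggy = [polygons[i] for i in mask]

# Intended behaviour: keep the entries where the mask is True.
fixed = select_by_mask(polygons, mask)

print(buggy)  # ['poly1', 'poly0', 'poly1', 'poly1']
print(fixed)  # ['poly0', 'poly2', 'poly3']
```

With a tensor mask, the equivalent step would be to convert the mask to integer indices before the loop.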
Which file should I put in _indexes for label_file for the ICDAR 2015 dataset? I want to generate a new .json because I have made some modifications to the IC15 images to evaluate specific challenges in natural scenes.
I know the first 9 triples meaning, what does the last triple mean?
When I train the model on the ic15 dataset, I get this error:
Original Traceback (most recent call last):
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/datasets/word_dataset.py", line 64, in getitem
img, anno = super(WordDataset, self).getitem(idx)
File "/home/.local/lib/python3.6/site-packages/torchvision/datasets/coco.py", line 118, in getitem
img, target = self.transforms(img, target)
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 24, in call
image, target = t(image, target)
File "/home/dulin/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 70, in call
target = [t.resize(image.size) for t in target]
File "/home/OCR/scene_text/Box_DN/maskrcnn_benchmark/data/transforms/transforms.py", line 70, in
target = [t.resize(image.size) for t in target]
AttributeError: 'dict' object has no attribute 'resize'
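I traced the code and found that target is the raw annotations, so I have no idea what is wrong. Has anyone else had this problem, and how can it be fixed? Thanks.
During testing, the path to an image folder is passed in. How can I pass image array data directly instead?
How do I convert ICDAR-format data to COCO format?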
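As a rough starting point for the conversion question, here is a minimal sketch that turns one ICDAR 2015 style ground-truth line (`x1,y1,x2,y2,x3,y3,x4,y4,transcription`) into a generic COCO annotation dictionary. The function name `icdar_line_to_coco` and the default `category_id=1` are assumptions made for illustration. It produces only the standard COCO fields; according to another comment in this thread, this repository additionally expects `kes` and `match_type` fields, which this sketch does not generate.

```python
def polygon_area(coords):
    """Shoelace area of a polygon given as [x1, y1, x2, y2, ...]."""
    points = list(zip(coords[0::2], coords[1::2]))
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def icdar_line_to_coco(line, image_id, ann_id, category_id=1):
    """Convert one ICDAR-style quadrilateral line to a COCO annotation dict."""
    # Some ICDAR files start with a byte-order mark; strip it if present.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = [float(v) for v in parts[:8]]  # the transcription may contain commas
    xs, ys = coords[0::2], coords[1::2]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],  # x, y, w, h
        "segmentation": [coords],
        "area": polygon_area(coords),
        "iscrowd": 0,
    }


ann = icdar_line_to_coco(
    "377,117,463,117,465,130,378,130,Genaxis Theatre", image_id=1, ann_id=1
)
print(ann["bbox"], ann["area"])
```

A full converter would also need the `images` and `categories` lists of the COCO JSON file, built from the image sizes and the class list of the dataset.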
Thank you for sharing this great work. I have been trying the dataset conversion script you posted, but it is not working for IC15. I have one question: can I use the ABCNet dataset format conversion for this repo? Thank you in advance for your time and kind response.
... Thanks!
2019-10-24 11:16:53,219 maskrcnn_benchmark.utils.checkpoint INFO: No checkpoint found. Initializing model from scratch
loading annotations into memory...
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
2019-10-24 11:16:53,293 maskrcnn_benchmark.trainer INFO: Start training
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
Done (t=0.06s)
creating index...
index created!
loading annotations into memory...
Done (t=0.07s)
creating index...
index created!
Hi, when training, the program exits after printing the log above, and training never starts.
Thank you very much for sharing this code. Can you help me with this error:
CUDA_VISIBLE_DEVICES=0 python demo/test_single_image.py --min-image-size 1000 --config-file /gpu/wangbeibei/code/Box/configs/r50_baseline.yaml --output_dir /gpu/wangbeibei/code/Box/results/ --img /gpu/wangbeibei/code/Box/testimages
image path /gpu/wangbeibei/code/Box/testimages/3.jpg
Segmentation fault
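Hello! I trained several models, but when I run test_single_image.py from the demo, every model produces exactly the same predictions. I even get the same results when the WEIGHT path is empty. Why?
1. The script that generates the labels (kes) is wrong:
the fields that need to be added to anno are kes and match_type, not keypoints;
2. The target is loaded incorrectly in coco:
target.addfield should use textKES and MTY rather than PersonKeypoints;
After these changes it runs.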
How do I run this with torchvision=0.3?
Hello, I see that test_single_image.py tests a single image. Is it possible to predict multiple images in one run?
2019-10-20 14:20:06,875 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
File "tools/train_net.py", line 174, in
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/engine/trainer.py", line 56, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in next
return self._process_next_batch(batch)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/word_dataset.py", line 64, in getitem
img, anno = super(WordDataset, self).getitem(idx)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torchvision/datasets/coco.py", line 117, in getitem
img = Image.open(os.path.join(self.root, path)).convert('RGB')
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/PIL/Image.py", line 2766, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/images/2019-07-02_103547.png'
But the image definitely exists.
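The path in the error message is relative, so it is resolved against the current working directory of the training process rather than against the repository root. The sketch below is one way to check what the path actually resolves to; `resolve_image_path` is a hypothetical helper, and the root and file name are copied from the traceback above.

```python
import os


def resolve_image_path(root, file_name):
    """Return the absolute path that root/file_name resolves to and whether it exists."""
    full_path = os.path.abspath(os.path.join(root, file_name))
    return full_path, os.path.isfile(full_path)


if __name__ == "__main__":
    print("working directory:", os.getcwd())
    path, exists = resolve_image_path("datasets/images", "2019-07-02_103547.png")
    print(path, "exists" if exists else "is missing")
```

If the printed absolute path is not where the image lives, either launch training from the directory that contains `datasets/` or use absolute paths in the dataset configuration.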
In the online demo, what is the difference between SCENE TEXT and HANDWRITTEN TEXT?
Upload the .yaml configuration for both.
I saw that you use cross entropy as the loss function. Did you try different losses?
python 3.6
This error occurs.
Hi author,
When I use the COCO 2017 dataset for training, the error below occurs: copy_if failed to synchronize: device-side assert triggered. Could you please give me some advice?
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 174, in <module>
main()
File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "/home/kv/workspace/Box_Discretization_Network-master/tools/train_net.py", line 73, in train
arguments,
File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
loss_dict = model(images, targets)
File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 56, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets)
File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 32, in forward
x, detections, loss_box = self.box(features, proposals, targets)
File "/home/kv/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py", line 56, in forward
[class_logits], [box_regression]
File "/home/kv/workspace/Box_Discretization_Network-master/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py", line 151, in __call__
sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
RuntimeError: copy_if failed to synchronize: device-side assert triggered
The config is below:
OUTPUT_DIR: "./output/r50_baseline"
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "/home/kv/data/fairy/r50_imagenet_pretrained.pth"
  BACKBONE:
    CONV_BODY: "R-50-FPN"
  RESNETS:
    BACKBONE_OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    ASPECT_RATIOS: (0.25, 0.5, 1.0, 2.0, 4.0)
  ROI_HEADS:
    USE_FPN: True
    SCORE_THRESH: 0.05
    NMS: 0.5
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 2
  MASK_ON: True
  ROI_MASK_HEAD:
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
    PREDICTOR: "MaskRCNNC4Predictor"
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 2
    RESOLUTION: 28
    SHARE_BOX_FEATURE_EXTRACTOR: False
  # BDN KE
  KE_ON: True
  ROI_KE_HEAD:
    FEATURE_EXTRACTOR: "KERCNNFPNFeatureExtractor"
    POOLER_RESOLUTION: 14
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    PREDICTOR: "KERCNNC4Predictor"
    RESOLUTION: 56
    SHARE_BOX_FEATURE_EXTRACTOR: False
    NUM_KES: 8
    KE_WEIGHT: 0.1
    MTY_WEIGHT: 0.01
    POSTPROCESS_KES: True # Must be true
    RESCORING: False
    RESCORING_GAMA: 1.4 # [0,2]
PROCESS:
  PNMS: False
  NMS_THRESH: 0.2
DATASETS:
  TRAIN: ("coco_2017_train", )
  TEST: ("coco_2017_val",)
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  BASE_LR: 0.0001
  BIAS_LR_FACTOR: 2
  WEIGHT_DECAY: 0.0001
  STEPS: (10000, 15000)
  MAX_ITER: 20000
  IMS_PER_BATCH: 4
  CHECKPOINT_PERIOD: 2500
INPUT:
  MIN_SIZE_TRAIN: (680,720,760,800,840,880,920,960,1000)
  MAX_SIZE_TRAIN: 1480
  MIN_SIZE_TEST: 1200
  MAX_SIZE_TEST: 1600
  CROP_PROB_TRAIN: 0.0
  ROTATE_PROB_TRAIN: 0.0
  ROTATE_DEGREE: 15
TEST:
  IMS_PER_BATCH: 3
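The assertion `t >= 0 && t < n_classes` in the log means that some target label fell outside the range of the classifier outputs. The config above sets NUM_CLASSES: 2, while the COCO 2017 detection annotations contain 80 object categories, so this mismatch is a plausible cause. The sketch below is a hypothetical sanity check, not part of the repository; it assumes that the dataset loader maps category ids to contiguous labels starting at 1, with 0 reserved for background.

```python
def check_label_range(category_ids, num_classes):
    """Check that contiguous labels (1..N, 0 = background) fit into num_classes.

    Returns (fits, required_num_classes).
    """
    unique_ids = sorted(set(category_ids))
    # Assumed mapping: the i-th distinct category id becomes label i + 1.
    contiguous = {cid: index + 1 for index, cid in enumerate(unique_ids)}
    max_label = max(contiguous.values()) if contiguous else 0
    return max_label < num_classes, max_label + 1


# Illustration with 80 distinct category ids and NUM_CLASSES: 2 from the config.
fits, required = check_label_range(range(1, 81), num_classes=2)
print(fits, required)  # False 81
```

In practice the ids would be read from the `categories` list of the annotation JSON, and NUM_CLASSES would be set to the required value reported by the check.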
Hi! Could you provide a pretrained model on ICDAR 2013 or 2015 or 2017? Thanks a lot!
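Hello, sorry to bother you again. When training with multiple GPUs I set os.environ["CUDA_VISIBLE_DEVICES"] = "1,2", but training actually runs only on GPU 1 and not on GPU 2. Are any other settings needed?
When running python setup.py build develop, I get an error. From searching online, it may be a version mismatch between torch and CUDA, but I have not found a way to fix it yet.
The exact error: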
In file included from /home/rebot/anaconda3/envs/cwt_DeepLearning/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:0,
from /home/rebot/DeepLearning/Box_Discretization_Network-master/maskrcnn_benchmark/csrc/cpu/dcn_v2_cpu.cpp:4:
/home/rebot/anaconda3/envs/cwt_DeepLearning/lib/python3.7/site-packages/torch/include/c10/cuda/CUDAStream.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
#include <cuda_runtime_api.h>
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
Has anyone found a solution to this problem?
Hello, what is the content of your class.txt?
@Yuliang-Liu
Detectron2 has been released, and it is faster than maskrcnn-benchmark in training.
Are you considering upgrading?
Hello, I would like to know how the text segmentation branch fine-tunes the original prediction structure, and how it interacts with the KE head and bbox branches to improve accuracy.
Which version of OpenCV do you run?
Thanks
index created!
2019-12-23 15:27:03,388 maskrcnn_benchmark.inference INFO: Start evaluation on ic15_test dataset(500 images).
0%| | 0/167 [00:00<?, ?it/s]THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 167/167 [01:03<00:00, 2.63it/s]
2019-12-23 15:28:06,865 maskrcnn_benchmark.inference INFO: Total run time: 0:01:03.476810 (0.12695361995697022 s / img per device, on 1 devices)
2019-12-23 15:28:06,866 maskrcnn_benchmark.inference INFO: Model inference time: 0:00:52.991849 (0.10598369789123535 s / img per device, on 1 devices)
2019-12-23 15:28:07,013 maskrcnn_benchmark.inference INFO: Preparing results for COCO format
2019-12-23 15:28:07,013 maskrcnn_benchmark.inference INFO: Preparing bbox results
2019-12-23 15:28:07,119 maskrcnn_benchmark.inference INFO: Preparing segm results
0it [00:00, ?it/s]
Traceback (most recent call last):
File "tools/test_net.py", line 99, in <module>
main()
File "tools/test_net.py", line 93, in main
output_folder=output_folder,
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/engine/inference.py", line 129, in inference
**extra_args)
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/__init__.py", line 27, in evaluate
return word_evaluation(**args)
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/__init__.py", line 20, in word_evaluation
expected_results_sigma_tol=expected_results_sigma_tol,
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 54, in do_coco_evaluation
coco_results["segm"] = prepare_for_coco_segmentation(predictions, dataset)
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 176, in prepare_for_coco_segmentation
rects = [mask_to_roRect(mask, [image_height, image_width]) for mask in masks]
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 176, in <listcomp>
rects = [mask_to_roRect(mask, [image_height, image_width]) for mask in masks]
File "/home/premy/projects/Box_Discretization_Network/Box_Discretization_Network/maskrcnn_benchmark/data/datasets/evaluation/word/word_eval.py", line 140, in mask_to_roRect
_, countours, hier = cv2.findContours(e.clone().numpy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE) # Aarlog
ValueError: not enough values to unpack (expected 3, got 2)
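The ValueError above is consistent with an OpenCV version difference: in OpenCV 3.x `cv2.findContours` returns three values (image, contours, hierarchy), while in OpenCV 4.x it returns two (contours, hierarchy). A version-agnostic way to unpack the result is to take the last two elements. The helper `unpack_contours` below is a hypothetical name, shown on plain tuples so that it runs without OpenCV installed.

```python
def unpack_contours(result):
    """Return (contours, hierarchy) from a cv2.findContours result.

    Works for both the 3-tuple (OpenCV 3.x) and the 2-tuple (OpenCV 4.x) form.
    """
    if len(result) not in (2, 3):
        raise ValueError("unexpected findContours result of length %d" % len(result))
    contours, hierarchy = result[-2:]
    return contours, hierarchy


# Stand-ins for the two return shapes; real values come from cv2.findContours.
opencv3_style = ("image", ["contour_a"], "hierarchy")
opencv4_style = (["contour_a"], "hierarchy")

print(unpack_contours(opencv3_style))
print(unpack_contours(opencv4_style))

# Intended use in word_eval.py (assumption, untested against the repository):
# contours, hier = unpack_contours(
#     cv2.findContours(mask_array, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE))
```

Alternatively, installing an OpenCV 3.x release matches the three-value unpacking used in the traceback.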
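Hello, I am trying to run this model on your competition dataset, but while converting the dataset to the format the model expects, I found that the script you provide contains segs = [int(kkpart) for kkpart in parts[4:]]. Which coordinates in the label does this part of parts correspond to? The gt is already given by parts[0:3]. Thanks.
How should this be used at inference time? For a dataset of irregular quadrilaterals, I would like to use the kes predictions directly, but no corresponding post-processing is provided. Do I need to write it myself?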
Traceback (most recent call last):
File "tools/train_net.py", line 174, in
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/engine/trainer.py", line 70, in do_train
loss_dict = model(images, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 56, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 71, in forward
x, detections, loss_ke, loss_mty = self.ke(ke_features, detections, targets)
File "/home2/Jc/anaconda3/envs/mb/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/ke_head.py", line 64, in forward
proposals = self.loss_evaluator.subsample(proposals, targets)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 107, in subsample
labels, kes, mty = self.prepare_targets(proposals, targets)
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 68, in prepare_targets
proposals_per_image, targets_per_image
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/modeling/roi_heads/ke_head/loss.py", line 53, in match_targets_to_proposals
target = target.copy_with_fields(["labels", "kes", "mty"])
File "/home2/Jc/Box_Discretization_Network/maskrcnn_benchmark/structures/bounding_box.py", line 251, in copy_with_fields
raise KeyError("Field '{}' not found in {}".format(field, self))
KeyError: "Field 'kes' not found in BoxList(num_boxes=5, image_width=1271, image_height=720, mode=xyxy)"
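How can I solve this problem?
Hello author, I have recently been working with the HRSC2016 dataset. I saw that your paper reports very high results on it, so I ran your code and found a few issues I would like to ask about.
1. The paper uses the COCO evaluation script, which means the 12 metric is used, whereas HRSC2016 is currently evaluated with the 07 metric. The 12 metric gives considerably higher scores than the 07 metric, so is this comparison in the paper unfair?
2. I also found that when evaluating on HRSC2016, the IoU is computed between the two horizontal bounding rectangles. This seriously undermines the credibility of the final evaluation results, as in the example in the figure below:
I hope the author can kindly advise.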
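To make the second point concrete, here is a small self-contained sketch that compares the IoU of the horizontal bounding rectangles with the IoU of the rotated boxes themselves. It is an illustration only and not the repository's evaluation code: the two boxes are made-up examples, and the polygon intersection uses Sutherland-Hodgman clipping, which is valid for convex polygons only.

```python
def signed_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of a convex polygon by another convex polygon."""
    if signed_area(clipper) < 0:
        clipper = clipper[::-1]  # the inside test below expects CCW order

    def inside(p, a, b):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        for j, cur in enumerate(current):
            prev = current[j - 1]
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
    return output


def polygon_iou(poly_a, poly_b):
    """IoU of two convex polygons given as lists of (x, y) points."""
    inter = abs(signed_area(clip_convex(poly_a, poly_b)))
    union = abs(signed_area(poly_a)) + abs(signed_area(poly_b)) - inter
    return inter / union if union > 0 else 0.0


def horizontal_iou(poly_a, poly_b):
    """IoU of the axis-aligned bounding rectangles of two polygons."""
    ax0, ay0 = min(p[0] for p in poly_a), min(p[1] for p in poly_a)
    ax1, ay1 = max(p[0] for p in poly_a), max(p[1] for p in poly_a)
    bx0, by0 = min(p[0] for p in poly_b), min(p[1] for p in poly_b)
    bx1, by1 = max(p[0] for p in poly_b), max(p[1] for p in poly_b)
    inter = max(0.0, min(ax1, bx1) - max(ax0, bx0)) * max(0.0, min(ay1, by1) - max(ay0, by0))
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


# Two thin, parallel, diagonal boxes that do not touch each other.
box_a = [(0, 1), (1, 0), (10, 9), (9, 10)]
box_b = [(3, -2), (4, -3), (13, 6), (12, 7)]
print(round(horizontal_iou(box_a, box_b), 3), polygon_iou(box_a, box_b))
```

For these two boxes the horizontal rectangles overlap with an IoU of about 0.32 even though the rotated boxes are disjoint, which is the kind of discrepancy the question describes.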
@Yuliang-Liu
Why are the TIoU results so low?
Please mention that IDIR and ODIR in single_image_demo.sh have to be changed before launching the script.
Also, the output directory is not actually created, and the generated pictures show up in the main directory:
-rw-rw-r-- 1 premy premy 181K Dec 23 15:36 output_dir0000000.jpg
-rw-rw-r-- 1 premy premy 333K Dec 23 15:36 output_dir0000001.jpg
Thanks
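The file names in the listing above look as if the directory name and the image name were concatenated without a path separator, although that is a guess from the output alone. A minimal sketch of the usual pattern, creating the directory first and joining the parts with `os.path.join`, is shown below. `build_output_path` is a hypothetical helper, and the seven-digit numbering only mirrors the listing.

```python
import os


def build_output_path(output_dir, index, ext=".jpg"):
    """Create output_dir if needed and return output_dir/<7-digit index><ext>."""
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, "{:07d}{}".format(index, ext))


if __name__ == "__main__":
    # Plain concatenation gives 'output_dir0000000.jpg' in the current directory.
    print("output_dir" + "{:07d}.jpg".format(0))
    # Joining puts the file inside the directory instead.
    print(build_output_path("output_dir", 0))
```

Passing the output directory with a trailing slash may also work as a workaround with the existing script, but the directory still has to exist.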