
devashishprasad / cascadetabnet


This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

License: MIT License

Python 100.00%
table-recognition table-structure-recognition table-detection table-detection-using-deep-learning

cascadetabnet's Introduction

👋 Hi there

🧬 I am currently harnessing generative AI to address the most complex challenges in Biology for advancements in drug discovery.

🌟 I have published 3 research papers in deep learning/machine learning. One of the papers was published at CVPR and has over 150 citations and 1.3K GitHub stars.

🎓 In the summer of 2023, I earned my Master of Science in Computer Science, specializing in Machine Learning, from Purdue University, achieving a GPA of 3.6/4.0.

🖥️ During my final year of graduate studies, I served as a Research Assistant at Kihara Lab, one of Purdue University's premier applied ML research facilities. There, I employed generative deep learning models for intricate protein structure analysis and established a machine learning serving infrastructure for these computation-intensive models (em.kiharalab.org).

👻 In the summer of 2022, I joined Snap Inc.'s Camera Platform team as a Machine Learning Engineer intern. My role involved developing and deploying a deep learning-based optical flow prediction model to automate Snapchat's video annotation process. This initiative enhanced video labeling and annotation efficiency by 15% for Snapchat's gigantic unlabelled video datasets.

🛰️ In my first year of graduate study, I collaborated with Viasat Inc. through Purdue's The Data Mine program as a Graduate Data Science Researcher. My research focused on developing Deep Learning algorithms to tackle blind image super-resolution challenges, intended specifically to enhance the quality of Viasat's internal satellite imagery.

🕸️ Before my relocation to the US, I completed five internships in India, focusing on Machine Learning and Deep Learning in the topics of News sentiment analysis, ML in finance, Sports vision analysis, Document understanding, Optical Character Recognition, Face Recognition, Fine-grained image classification, Chatbots, etc.

🏅 I was a grand finalist of the Smart India Hackathon (India's biggest hackathon) three times, during which I worked on ML/DL-based projects for ISRO, ITC Ltd, and DRDO (esteemed Indian organizations).

www.devashishprasad.com

cascadetabnet's People

Contributors

akadirpamukcu, ayangadpal, devashishprasad, francescoperessini, kshitijkapadni, manishdv, mhmd-azeez, mrzilinxiao


cascadetabnet's Issues

Unable to fine-tune due to missing mask labels

Hi, I am currently fine-tuning the pre-trained model (epoch36.pth) but I am encountering an error whenever I load my custom dataset generated using LabelImg.

Traceback (most recent call last):
  File "tools/train.py", line 151, in <module>
    main()
  File "tools/train.py", line 147, in main
    meta=meta)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 384, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 279, in train
    for i, data_batch in enumerate(data_loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 132, in __getitem__
    data = self.prepare_train_img(idx)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 145, in prepare_train_img
    return self.pipeline(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/compose.py", line 24, in __call__
    data = t(data)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 147, in __call__
    results = self._load_masks(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 125, in _load_masks
    gt_masks = results['ann_info']['masks']
KeyError: 'masks'

I noticed specifically from the config file that the training pipeline requires masks to be enabled.

    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),

Did you do something differently when annotating with LabelImg to indicate that mask labels exist? I followed the provided example, but I still get the error about masks. I also tried setting with_mask=False, but I honestly don't know how that would affect the rest of the training process.

Example annotation from LabelImg:

<annotation>
	<folder>jpeg_images</folder>
	<filename>acc_2018_fs_008.jpg</filename>
	<path>/Users/rt/Desktop/99_annotated/jpeg_images/acc_2018_fs_008.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>4958</width>
		<height>7017</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
        <name>borderless</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>993</xmin>
            <ymin>1020</ymin>
            <xmax>4223</xmax>
            <ymax>5479</ymax>
        </bndbox>
    </object>
	<object>
		<name>cell</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>3559</xmin>
			<ymin>1047</ymin>
			<xmax>4021</xmax>
			<ymax>1107</ymax>
		</bndbox>
	</object>
</annotation>

Thank you and I appreciate this awesome work by the way.
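
One possible workaround for the KeyError above, sketched under assumptions: LabelImg writes only bounding boxes, so a loader configured with with_mask=True finds no mask data. If the annotations are converted to COCO format, each box can be given a rectangular polygon as its segmentation. The helper name voc_objects_to_coco and the category-ID mapping are hypothetical and not part of the repository.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text, category_ids, image_id=0, first_ann_id=1):
    """Convert LabelImg (Pascal VOC) objects to COCO-style annotation dicts.

    Each bounding box is also written as a four-point polygon so that a
    pipeline expecting masks has something to load.
    """
    root = ET.fromstring(xml_text)
    annotations = []
    for offset, obj in enumerate(root.iter("object")):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": first_ann_id + offset,
            "image_id": image_id,
            # category_ids is a hypothetical name -> id mapping you provide
            "category_id": category_ids[obj.findtext("name")],
            "bbox": [xmin, ymin, width, height],  # COCO uses x, y, w, h
            "area": width * height,
            "iscrowd": 0,
            # Rectangle corners starting at the top-left: x1, y1, x2, y2, ...
            "segmentation": [[xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]],
        })
    return annotations
```

Setting with_mask=False instead would likely also require removing the mask head from the model config, so it changes the model and not only the data loading.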

Table structure recognition is not predicted for second table of demo image

First and foremost, thanks for this interesting paper and also this repository!

As the demo GIF in the README shows, both tables are detected and structure recognition succeeds for both of them (in the last step of the animation).

However, when I run prediction on this demo image, I get different results:

image

As you can see in the screenshot, both tables are detected successfully. But in the right table no cells are recognised, and in the left table the cells in the last columns are not recognised either. I'm using the same checkpoint file and configuration as in the demo Jupyter notebook. I tried lowering the threshold, but that didn't help.

How can I improve the prediction so that I get the same performance as shown in the demo gif? Am I missing some postprocessing, or am I not using the optimal configuration, or something else? I'm not sure, I hope you could help.

Thanks!

XML output of extracted tabular text

Hi Devashish -

For reference, is it possible to upload the XML output results of extracted tabular text for a few example documents?

Thanks,
Sekhar H.

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

I'm trying to run ICDAR-13 model. But I'm getting this error.


RuntimeError Traceback (most recent call last)

in ()
10
11 # Run Inference
---> 12 result = inference_detector(model, img)
13
14 # Visualization results

11 frames

/content/drive/My Drive/mmdetection/mmdet/apis/inference.py in inference_detector(model, img)
84 # forward the model
85 with torch.no_grad():
---> 86 result = model(return_loss=False, rescale=True, **data)
87 return result
88

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
47 'method of nn.Module')
48 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
---> 49 return old_func(*args, **kwargs)
50 # get the arg spec of the decorated method
51 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward(self, img, img_metas, return_loss, **kwargs)
147 return self.forward_train(img, img_metas, **kwargs)
148 else:
--> 149 return self.forward_test(img, img_metas, **kwargs)
150
151 def show_result(self, data, result, dataset=None, score_thr=0.3):

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward_test(self, imgs, img_metas, **kwargs)
128 if 'proposals' in kwargs:
129 kwargs['proposals'] = kwargs['proposals'][0]
--> 130 return self.simple_test(imgs[0], img_metas[0], **kwargs)
131 else:
132 # TODO: support test augmentation for predefined proposals

/content/drive/My Drive/mmdetection/mmdet/models/detectors/cascade_rcnn.py in simple_test(self, img, img_metas, proposals, rescale)
340
341 bbox_feats = bbox_roi_extractor(
--> 342 x[:len(bbox_roi_extractor.featmap_strides)], rois)
343 if self.with_shared_head:
344 bbox_feats = self.shared_head(bbox_feats)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
125 'method of nn.Module')
126 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
--> 127 return old_func(*args, **kwargs)
128 # get the arg spec of the decorated method
129 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/roi_extractors/single_level.py in forward(self, feats, rois, roi_scale_factor)
103 if inds.any():
104 rois_ = rois[inds, :]
--> 105 roi_feats_t = self.roi_layers[i](feats[i], rois_)
106 roi_feats[inds] = roi_feats_t
107 return roi_feats

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(self, features, rois)
142 else:
143 return roi_align(features, rois, self.out_size, self.spatial_scale,
--> 144 self.sample_num, self.aligned)
145
146 def __repr__(self):

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(ctx, features, rois, out_size, spatial_scale, sample_num, aligned)
34 out_w)
35 roi_align_cuda.forward_v1(features, rois, out_h, out_w,
---> 36 spatial_scale, sample_num, output)
37 else:
38 output = roi_align_cuda.forward_v2(features, rois,

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

Can anyone help me solve this?? Thanks in advance.

the function of "Table Structure Recognition" folder

Hello, I have a question about the function of the "Table Structure Recognition" folder. The "main" file in this folder can generate XML. Is the generated XML meant to be used as label data for training the model from scratch, or does it describe the table structure based on the model's predictions?

Borderless tables

I am not able to produce any output for borderless tables. Has the code for cell masks in borderless tables been released, or am I missing something?

mmdetection import library error

Hi Devashish -

In the main.py file, I see the mmdetection import statement as:
from mmdet.apis import inference_detector, show_result, init_detector

The "show_result" must be changed because it has now been renamed as "show_result_pyplot" in mmdetection. The import statement should be as follows:

from mmdet.apis import inference_detector, show_result_pyplot, init_detector

Thanks,
Sekhar H.
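
A minimal sketch of an import that tolerates either name, assuming only what the issue above states (that newer mmdetection exposes show_result_pyplot where older code used show_result). The helper first_available is hypothetical. The two functions may also take different arguments depending on the installed version, so the call site should be checked as well.

```python
import importlib


def first_available(module_name, candidates):
    """Return the first attribute in `candidates` that `module_name` provides."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{module_name} provides none of {candidates}")


# Intended use (requires mmdetection to be installed):
# show = first_available("mmdet.apis", ["show_result_pyplot", "show_result"])
```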

How to train your model from scratch?

Hi! Very interesting paper and I am interested in training the model from scratch. Do you have a script available for reference? I am not an expert in object detection and a reference script for full pipeline training would be greatly appreciated.

Thanks.

mmdetection v1.2 won't install without a GPU

Hi - It looks like I can't install mmdetection v1.2 without a GPU, even though I installed CUDA 10.0 and the appropriate version of cuDNN. Is this understanding correct? My installation fails with the error "no CUDA-capable device is detected".

I'm unable to find any clear information from OpenMMLab on this subject. However, I am able to install v2.0 without a GPU, I believe because 2.0 falls back to the CPU by default if no GPU device is found.

Keeping the model on v1.2 will likely limit its use for people who run their projects without a GPU. Is there a way to convert the model trained on v1.2 to v2.0?

Thanks,
Sekhar H.

Training Metrics (Precision, Recall, and F1)

Hello, thank you for the VOC-to-COCO script. It was very helpful. May I ask whether you used a custom script to measure model accuracy? Are there any resources you can point me to for more information?
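
For readers with the same question, here is a generic sketch of how precision, recall, and F1 can be computed for detected boxes by matching them to ground truth at an IoU threshold. It is not the authors' evaluation script. The function names, the greedy matching strategy, and the 0.5 threshold are all assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def precision_recall_f1(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to its best unmatched ground-truth box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched once
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```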

Post-processing in this test case is so slow

if len(res_border) != 0:
    ## call border script for each table in image
    for res in res_border:
        try:
            root.append(border(res,cv2.imread(i)))
        except:
            pass
if len(res_bless) != 0:
    if len(res_cell) != 0:
        for no,res in enumerate(res_bless):
            root.append(borderless(res,cv2.imread(i),res_cell))

test image:https://raw.githubusercontent.com/cndplab-founder/ICDAR2019_cTDaR/master/test/TRACKB2/cTDaR_t10080.jpg
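
One thing visible in the snippet above is that cv2.imread(i) is called again for every table. Whether that is the real bottleneck is not known without profiling, but the sketch below decodes the file once and reuses it. The function build_table_elements is hypothetical, and the reader and the two post-processing functions are passed in as parameters so that the example stands alone.

```python
def build_table_elements(image_path, res_border, res_bless, res_cell,
                         read_image, border, borderless):
    """Run the bordered and borderless post-processing with a single image read."""
    image = read_image(image_path)  # decode the file once, not once per table
    elements = []
    for res in res_border:
        try:
            elements.append(border(res, image))
        except Exception as exc:
            # Report the failure instead of silently discarding it
            print(f"border() failed for {res}: {exc}")
    if res_cell:
        for res in res_bless:
            elements.append(borderless(res, image, res_cell))
    return elements
```

In the original script this would be called with cv2.imread, border, and borderless. If either function modifies the image in place, pass image.copy() to it to keep the original behaviour.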

Fine Tuning

Hi, can you please provide scripts used for training for the purpose of fine-tuning the model.

Thanks

name 'etree' is not defined

[Table status] : Processing table with lines
<PIL.Image.Image image mode=RGB size=812x349 at 0x7F9D02BBA860>
<PIL.Image.Image image mode=RGB size=1224x1584 at 0x7F9D012A60B8>
Traceback (most recent call last):
  File "CascadeTabNet/Table Structure Recognition/main.py", line 53, in <module>
    root.append(borderless(res,cv2.imread(i),res_cell))
  File "/content/gdrive/My Drive/CascadeTabNet/CascadeTabNet/Table Structure Recognition/Functions/blessFunc.py", line 365, in borderless
    tableXML = etree.Element("table")
NameError: name 'etree' is not defined
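
The NameError indicates that blessFunc.py uses etree without importing it. A minimal sketch of the missing import follows, assuming the project builds its XML with lxml (an assumption; the standard library offers a compatible Element API, used here as a fallback).

```python
try:
    from lxml import etree  # assumption: the project builds XML with lxml
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback

# The failing line from the traceback now resolves
tableXML = etree.Element("table")
cell = etree.SubElement(tableXML, "cell")
print(etree.tostring(tableXML))
```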

CUDA error

Hi, I have the right version of CUDA but still get this error when running the main file. Can you help me with this?

Environment :
sys.platform: linux
Python: 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GPU 0: Tesla P100-PCIE-16GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.4.0+cu100
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0+cu100
OpenCV: 4.1.2
MMCV: 0.5.3
MMDetection: 1.2.0+0f33c08
MMDetection Compiler: GCC 7.5
MMDetection CUDA Compiler: 10.0

Error Traceback

Traceback (most recent call last):
  File "Table Structure Recognition/main.py", line 23, in <module>
    result = inference_detector(model, i)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/apis/inference.py", line 86, in inference_detector
    result = model(return_loss=False, rescale=True, **data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 149, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 130, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 324, in simple_test
    self.test_cfg.rpn) if proposals is None else proposals
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/test_mixins.py", line 34, in simple_test_rpn
    proposal_list = self.rpn_head.get_bboxes(proposal_inputs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/anchor_head.py", line 276, in get_bboxes
    scale_factor, cfg, rescale)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/rpn_head.py", line 92, in get_bboxes_single
    proposals, _ = nms(proposals, cfg.nms_thr)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/ops/nms/nms_wrapper.py", line 54, in nms
    inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fde7991b193 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<_nv_dl_wrapper_t<nv_dl_tag<void (
)(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef, __nv_dl_wrapper_t<_nv_dl_tag<void (
)(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> const&) + 0x7bb (0x7fde7f58387b in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch.so)
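
A "no kernel image is available" error generally means that some compiled CUDA code was not built for the GPU's compute capability. As a possible diagnostic, the sketch below checks whether the architecture flags from the environment report above cover the Tesla P100 (compute capability 6.0). The helper names are hypothetical. The check only looks at the flags PyTorch reports, not at how the mmdetection ops were compiled.

```python
import re


def compiled_sm_archs(nvcc_flags):
    """Extract the sm_XX targets from an 'NVCC architecture flags' string."""
    return {int(arch) for arch in re.findall(r"code=sm_(\d+)", nvcc_flags)}


def has_kernel_for(capability, nvcc_flags):
    """True if binary code exists for this (major, minor) compute capability.

    PTX entries (code=compute_XX) that could be JIT-compiled are ignored.
    """
    major, minor = capability
    return major * 10 + minor in compiled_sm_archs(nvcc_flags)


# Flags copied from the environment report in this issue
flags = ("-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
         "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
         "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
         "-gencode;arch=compute_37,code=compute_37")

# On a live machine the capability comes from torch.cuda.get_device_capability()
print(has_kernel_for((6, 0), flags))
```

Since the PyTorch build itself appears to include sm_60, one place to look (a guess, not a confirmed cause) is the separately compiled mmdetection extensions, which may have been built on a machine with a different GPU.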

Run a prediction

I am using PyTorch for the first time. I don't understand how to run a prediction on an image to get the table. Please guide me.
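
A minimal sketch of running one prediction, using the two mmdetection calls that appear in the tracebacks on this page (init_detector and inference_detector). The wrapper run_prediction and the example paths are hypothetical. The config and checkpoint must be the ones downloaded for this project.

```python
def run_prediction(config_path, checkpoint_path, image_path, device="cuda:0"):
    """Load a detector from a config and checkpoint, then run it on one image.

    The import sits inside the function so that this module can still be
    imported on machines where mmdetection is not installed.
    """
    from mmdet.apis import init_detector, inference_detector

    model = init_detector(config_path, checkpoint_path, device=device)
    return inference_detector(model, image_path)


# Hypothetical paths; substitute the files you actually downloaded.
# result = run_prediction("path/to/config.py", "path/to/epoch_36.pth", "page.jpg")
```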

Prediction in colab

Hi, I'm a newbie with mmdetection. I had some issues trying to run it on CPU (also mentioned in the open GitHub issues), so I decided to try a prediction in Colab, but I hit the error below and could not find the solution on my own.

image

directory not empty error

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2'
Traceback (most recent call last):
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 92, in _file2dict
    osp.join(temp_config_dir, temp_config_name))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44\\tmpg99tl4cb.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 17, in <module>
    model = init_detector(config_fname, os.path.join(checkpoint_path, epoch))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmdet-2.0.0+a9bedfb-py3.6-win-amd64.egg\mmdet\apis\inference.py", line 28, in init_detector
    config = mmcv.Config.fromfile(config)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 165, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 105, in _file2dict
    temp_config_file.close()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 809, in __exit__
    self.cleanup()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 813, in cleanup
    _shutil.rmtree(self.name)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 494, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 393, in _rmtree_unsafe
    onerror(os.rmdir, path, sys.exc_info())
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 391, in _rmtree_unsafe
    os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44'


Running main.py with this config file and this model file, I keep getting this error.

Other Details:
CUDA version : 9.2
OS : Windows
Main.py config

image_path = 'Examples\\cTDaR_t10120.jpg'
xmlPath = 'Examples'


config_fname = "Examples\\faster_rcnn_hrnetv2p_w32_1x_coco.py" 
checkpoint_path = "C:\\Users\\ABC\\Music\\table-structure-rec\\CascadeTabNet\\Table Structure Recognition\\Examples\\"
epoch = 'faster_rcnn_hrnetv2p_w32_1x_coco_20200130-6e286425.pth'


Something is wrong with the detection results

When I test the image cTDaR_t10120.jpg from the Examples folder, it detects only one table, and the cells picture you show is as follows:

I used epoch_36 for the test. In addition, once the code reaches the output below, it never exits and stays there indefinitely:
table: [ 323 208 1135 557]
[Table status] : Processing table with lines
cells

Variation in Results

I was running the evaluation on ICDAR-13 using the pre-trained model you provided and the default configuration file.
There is a huge variation in the results.
recall:1.0, precision:0.843, f_measure:0.9216

Are you doing pre-processing on the testing images as well?
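
As a quick sanity check on the numbers quoted above, the usual F-measure is the harmonic mean of precision and recall. For precision 0.843 and recall 1.0 that gives roughly 0.915, whereas the reported 0.9216 is closer to their arithmetic mean (about 0.9215). This may mean the evaluation script aggregates its scores differently, but that is a guess. The function name below is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


precision, recall = 0.843, 1.0
print(round(f_measure(precision, recall), 4))  # harmonic mean
print(round((precision + recall) / 2, 4))      # arithmetic mean, for comparison
```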

temp_lines_ver is None in borderFunc.py

Hi,

first of all, thank you for the great work.

I tried to run main.py in Table Structure Recognition and encountered the problem that temp_lines_ver is set to None when extract_table is called without the lines parameter in borderFunc.py.

See here

So when iterating over it here, this will obviously throw an exception.

I'm not familiar enough with the code yet, but it should be an easy fix for someone who is.

Cheers
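
A minimal sketch of a guard against the None case described above. The helper as_line_list is hypothetical. Whether an empty list is the right default for extract_table is not known from this issue, and the function may instead need to detect the lines itself when none are passed.

```python
def as_line_list(lines):
    """Return `lines` unchanged, or an empty list when it is None."""
    return [] if lines is None else lines


temp_lines_ver = None  # what the issue reports when `lines` is not passed
for line in as_line_list(temp_lines_ver):
    print(line)  # never reached for None, and no TypeError is raised
```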

repair table

mask

The lines at the top are missing. How can I repair them? Thank you.

Questions about the prediction of the model

The prediction of the model is a list of 80 arrays. Which one represents the cell bounding boxes and which represents the table bounding boxes? I am interested in extracting the vertices for bounding box of table.
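
In mmdetection, detection results are grouped per class, and each row has the form [xmin, ymin, xmax, ymax, score]. Models with a mask head return a (boxes, masks) tuple. The sketch below picks out the boxes for one class. The class order in ASSUMED_CLASS_INDEX and the 0.85 threshold are assumptions that should be checked against the repository's main.py. The function name is hypothetical.

```python
# Assumed class order; verify against the repository's main.py before use.
ASSUMED_CLASS_INDEX = {"bordered": 0, "cell": 1, "borderless": 2}


def boxes_for_class(result, class_name, score_threshold=0.85):
    """Return integer [xmin, ymin, xmax, ymax] boxes for one class."""
    # Mask models return (bbox_results, segm_results); keep only the boxes.
    bbox_results = result[0] if isinstance(result, tuple) else result
    rows = bbox_results[ASSUMED_CLASS_INDEX[class_name]]
    return [
        [int(value) for value in row[:4]]
        for row in rows
        if float(row[4]) >= score_threshold
    ]
```

The four integers of a box give its corners directly: (xmin, ymin) is the top-left vertex and (xmax, ymax) the bottom-right one.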

Is there an alternative to Colab?

I want to debug the code to understand it well. I originally used VS Code but gave up because my Mac does not support CUDA, so I tried Colab. However, debugging there is not a good experience for me, and each operation is slow.

Is there an alternative to Colab, or do you have any tips?

Hi all,

Hi all,
I'm running the demo on Colab, but the "Run the Predictions" step does not succeed.

Predicted image co-ordinates are lifted uniformly above original cell value.

Hi,

output xml coordinates plotted on the input images:
https://prnt.sc/sl1f28
https://prnt.sc/sl1fi5
https://prnt.sc/sl1g42
(All blue lines are manually drawn based on the xml output coordinates for these images)

I have attached the input images that I used with this model, with bounding boxes drawn manually from the XML output.

In all these images the bounding boxes are uniformly shifted upwards. This suggests to me that the size of the image I send is altered at prediction time, and that the XML output holds the coordinates of the altered image (please correct me if this is not the case).

If so, please let me know where I can get the altered images, so that the XML output coordinates match when plotted.


My next question: if the above is true, how do the table detection coordinates match correctly? For all 3 attached table images, I also drew the bounding box for the entire table, and it fits perfectly.

If all the cell-level coordinates are shifted, how come the table-level coordinates alone come out correctly?

Thanks,
Anand.

Training error

I was trying to train the model on a custom dataset for table detection on my local system with COCO-style annotations.

However, I encounter an error while training:
mmdet - ERROR - The testing results of the whole dataset is empty.
The evaluation results on the validation data are all empty, so I cannot generate results; I only get empty arrays as output.

I am not able to identify any issue. Any help will be appreciated.
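
One thing worth ruling out when evaluation results are empty on a custom COCO dataset is a mismatch between the category names in the annotation file and the class names the dataset is configured with. This is an assumption about a possible cause, not a diagnosis of this issue. The helper below is hypothetical and compares the two lists.

```python
def category_mismatch(coco, classes):
    """Compare category names in a COCO annotation dict with configured classes."""
    annotated = {category["name"] for category in coco.get("categories", [])}
    expected = set(classes)
    return {
        # Configured classes that never appear in the annotation file
        "missing_from_annotations": sorted(expected - annotated),
        # Annotated categories that the configuration does not list
        "missing_from_classes": sorted(annotated - expected),
    }
```

The annotation dict can be obtained with json.load on the annotation file. Two empty lists mean the names agree, in which case the cause lies elsewhere.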
