devashishprasad / cascadetabnet

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

License: MIT License

Python 100.00%

table-recognition table-structure-recognition table-detection table-detection-using-deep-learning

cascadetabnet's Introduction

👋 Hi there

🧬 I am currently harnessing generative AI to address the most complex challenges in Biology for advancements in drug discovery.

🌟 I have published 3 research papers in deep learning/machine learning. One of the papers was published at CVPR and has over 150 citations and 1.3K GitHub stars.

🎓 In the summer of 2023, I earned my Master of Science in Computer Science, specializing in Machine Learning, from Purdue University, achieving a GPA of 3.6/4.0.

🖥️ During my final year of graduate studies, I served as a Research Assistant at Kihara Lab, one of Purdue University's premier applied ML research facilities. There, I employed generative deep learning models for intricate protein structure analysis and established a machine learning serving infrastructure for these computation-intensive models (em.kiharalab.org).

👻 In the summer of 2022, I joined Snap Inc.'s Camera Platform team as a Machine Learning Engineer intern. My role involved developing and deploying a deep learning-based optical flow prediction model to automate Snapchat's video annotation process. This initiative enhanced video labeling and annotation efficiency by 15% for Snapchat's gigantic unlabelled video datasets.

🛰️ In my first year of graduate study, I collaborated with Viasat Inc. through Purdue's The Data Mine program as a Graduate Data Science Researcher. My research focused on developing Deep Learning algorithms to tackle blind image super-resolution challenges, intended specifically to enhance the quality of Viasat's internal satellite imagery.

🕸️ Before my relocation to the US, I completed five internships in India, focusing on Machine Learning and Deep Learning in the topics of News sentiment analysis, ML in finance, Sports vision analysis, Document understanding, Optical Character Recognition, Face Recognition, Fine-grained image classification, Chatbots, etc.

🏅 I was a grand finalist at the Smart India Hackathon (India's biggest hackathon) three times. During these events, I worked on ML/DL-based projects for ISRO, ITC Ltd, and DRDO (India's esteemed organizations).

www.devashishprasad.com

cascadetabnet's Issues

Unable to fine-tune due to missing mask labels

Hi, I am currently fine-tuning the pre-trained model (epoch36.pth) but I am encountering an error whenever I load my custom dataset generated using LabelImg.

Traceback (most recent call last):
  File "tools/train.py", line 151, in <module>
    main()
  File "tools/train.py", line 147, in main
    meta=meta)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 384, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 279, in train
    for i, data_batch in enumerate(data_loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 132, in __getitem__
    data = self.prepare_train_img(idx)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 145, in prepare_train_img
    return self.pipeline(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/compose.py", line 24, in __call__
    data = t(data)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 147, in __call__
    results = self._load_masks(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 125, in _load_masks
    gt_masks = results['ann_info']['masks']
KeyError: 'masks'

I noticed specifically from the config file that the training pipeline requires masks to be enabled.

    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),

Is there something you did differently when annotating with LabelImg to indicate the existence of label masks? I followed the example provided but still get an error about masks. I also tried setting with_mask=False, but I honestly don't know how that would affect the training process as a whole.

Example annotation from LabelImg:

<annotation>
	<folder>jpeg_images</folder>
	<filename>acc_2018_fs_008.jpg</filename>
	<path>/Users/rt/Desktop/99_annotated/jpeg_images/acc_2018_fs_008.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>4958</width>
		<height>7017</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
        <name>borderless</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>993</xmin>
            <ymin>1020</ymin>
            <xmax>4223</xmax>
            <ymax>5479</ymax>
        </bndbox>
    </object>
	<object>
		<name>cell</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>3559</xmin>
			<ymin>1047</ymin>
			<xmax>4021</xmax>
			<ymax>1107</ymax>
		</bndbox>
	</object>
</annotation>

Thank you and I appreciate this awesome work by the way.
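
One possible workaround, sketched here as an assumption and not as the authors' confirmed procedure: LabelImg records only bounding boxes, so when the annotations are converted to COCO style, each box can also be written as a rectangular polygon in the `segmentation` field, which gives a mask loader something to read. The function name `voc_objects_to_coco` and the category order below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical category order; adjust it to match your own config.
CATEGORIES = ("bordered", "cell", "borderless")


def voc_objects_to_coco(xml_text, image_id=1, categories=CATEGORIES):
    """Turn the <object> entries of one LabelImg (Pascal VOC) file into COCO-style dicts."""
    root = ET.fromstring(xml_text)
    category_ids = {name: index + 1 for index, name in enumerate(categories)}
    annotations = []
    for ann_id, obj in enumerate(root.iter("object"), start=1):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": category_ids[obj.findtext("name")],
            "bbox": [xmin, ymin, width, height],  # COCO boxes are [x, y, w, h]
            "area": width * height,
            "iscrowd": 0,
            # Rectangle polygon standing in for a real mask.
            "segmentation": [[xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]],
        })
    return annotations


SAMPLE = """<annotation>
  <object><name>borderless</name>
    <bndbox><xmin>993</xmin><ymin>1020</ymin><xmax>4223</xmax><ymax>5479</ymax></bndbox>
  </object>
</annotation>"""

annotations = voc_objects_to_coco(SAMPLE)
print(annotations[0]["bbox"], annotations[0]["segmentation"])
```

A rectangle is only a coarse stand-in for a true mask, so whether it is good enough for fine-tuning would need to be checked against validation results.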

Table structure recognition is not predicted for second table of demo image

First and foremost, thanks for this interesting paper and also this repository!

Now, as you can see in the README, in the demo gif not only both tables are detected but structure recognition is successful for both tables (in the last step of the animation).

However, when predicting on this demo image, I get different results:

As you can see in the screenshot, both tables are detected successfully. But in the right table no cell is recognised, and in the left table the cells in the last columns are not recognised either. I'm using the same checkpoint file and configuration as in the demo Jupyter notebook. I tried lowering the threshold, but that didn't help.

How can I improve the prediction so that I get the same performance as shown in the demo gif? Am I missing some postprocessing, or am I not using the optimal configuration, or something else? I'm not sure, I hope you could help.

Thanks!

ModuleNotFoundError: No module named 'mmdet'

I set up the environment with the following commands.

pip install -q mmcv terminaltables
git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
cd "mmdetection"
python setup.py install
python setup.py develop
pip install -r {"requirements.txt"}

However, when I run this project in Colab I encounter this error. What went wrong?
https://colab.research.google.com/drive/1WLNZVaKPMgRW-YGk4mrHu6965__ppRMV#scrollTo=QlL7jDQ6Q40I
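
A quick way to narrow this down, offered as a generic diagnostic and not a fix specific to this notebook: check whether the interpreter that runs the project can actually locate the package. The helper name `is_importable` is hypothetical; `importlib.util.find_spec` is part of the standard library.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True when the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("interpreter:", sys.executable)
print("mmdet importable:", is_importable("mmdet"))
print("mmcv importable:", is_importable("mmcv"))
```

If `mmdet` is reported as not importable right after installation, the install likely went to a different environment or ran from a different working directory than expected.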

XML output of extracted tabular text

Hi Devashish -

For reference, is it possible to upload the XML output results of extracted tabular text for a few example documents?

Thanks,
Sekhar H.

cascade_mask_rcnn_hrnetv2p_w32_20e.py

When I read cascade_mask_rcnn_hrnetv2p_w32_20e.py, I found numclass=81. Why is numclass set to 81? Isn't this incorrect?

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

I'm trying to run ICDAR-13 model. But I'm getting this error.

RuntimeError Traceback (most recent call last)

in ()
10
11 # Run Inference
---> 12 result = inference_detector(model, img)
13
14 # Visualization results

11 frames

/content/drive/My Drive/mmdetection/mmdet/apis/inference.py in inference_detector(model, img)
84 # forward the model
85 with torch.no_grad():
---> 86 result = model(return_loss=False, rescale=True, **data)
87 return result
88

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
47 'method of nn.Module')
48 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
---> 49 return old_func(*args, **kwargs)
50 # get the arg spec of the decorated method
51 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward(self, img, img_metas, return_loss, **kwargs)
147 return self.forward_train(img, img_metas, **kwargs)
148 else:
--> 149 return self.forward_test(img, img_metas, **kwargs)
150
151 def show_result(self, data, result, dataset=None, score_thr=0.3):

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward_test(self, imgs, img_metas, **kwargs)
128 if 'proposals' in kwargs:
129 kwargs['proposals'] = kwargs['proposals'][0]
--> 130 return self.simple_test(imgs[0], img_metas[0], **kwargs)
131 else:
132 # TODO: support test augmentation for predefined proposals

/content/drive/My Drive/mmdetection/mmdet/models/detectors/cascade_rcnn.py in simple_test(self, img, img_metas, proposals, rescale)
340
341 bbox_feats = bbox_roi_extractor(
--> 342 x[:len(bbox_roi_extractor.featmap_strides)], rois)
343 if self.with_shared_head:
344 bbox_feats = self.shared_head(bbox_feats)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
125 'method of nn.Module')
126 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
--> 127 return old_func(*args, **kwargs)
128 # get the arg spec of the decorated method
129 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/roi_extractors/single_level.py in forward(self, feats, rois, roi_scale_factor)
103 if inds.any():
104 rois_ = rois[inds, :]
--> 105 roi_feats_t = self.roi_layers[i](feats[i], rois_)
106 roi_feats[inds] = roi_feats_t
107 return roi_feats

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(self, features, rois)
142 else:
143 return roi_align(features, rois, self.out_size, self.spatial_scale,
--> 144 self.sample_num, self.aligned)
145
146 def repr(self):

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(ctx, features, rois, out_size, spatial_scale, sample_num, aligned)
34 out_w)
35 roi_align_cuda.forward_v1(features, rois, out_h, out_w,
---> 36 spatial_scale, sample_num, output)
37 else:
38 output = roi_align_cuda.forward_v2(features, rois,

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

Can anyone help me solve this?? Thanks in advance.

Checkpoint and dataset cannot be loaded

The dataset and checkpoint cannot be loaded successfully.

May I know the CUDA version that has been installed?

The function of the "Table Structure Recognition" folder

Hello, I have a question about the function of the "Table Structure Recognition" folder. The "main" file in this folder can generate XML. Is the generated XML meant to be used as label data for training the model from scratch, or does it describe the table structure based on the data predicted by the model?

Borderless tables

I am not able to produce any output for borderless tables. Has the code for cell masks in borderless tables been released, or am I missing something?

Vertical Lines in Borderless Tables

I ran the model on my own dataset, and the results for borderless tables looked like this. Am I missing something?

Is there any training model available to train the model to improve accuracy?

I can see that the model is not generalising well to unseen data, so if any training model is available, it would be easy to improve the model through training.

Please find the results for some sample tables with respect to the 7 given models.
results.zip

mmdetection import library error

Hi Devashish -

In the main.py file, I see the mmdetection import statement as:
from mmdet.apis import inference_detector, show_result, init_detector

The "show_result" must be changed because it has now been renamed as "show_result_pyplot" in mmdetection. The import statement should be as follows:

from mmdet.apis import inference_detector, show_result_pyplot, init_detector

Thanks,
Sekhar H.
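
A version-tolerant import is one way to cope with the rename, sketched here under the assumption that only the function name matters to the caller. The helper `first_available` is hypothetical. The two mmdetection functions may also differ in their arguments, so the call site may still need adjusting.

```python
import importlib


def first_available(module_name, candidates):
    """Return the first attribute in `candidates` that the module exposes, or None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # the package itself is not installed
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    return None


# Prefer the newer name, fall back to the older one.
show = first_available("mmdet.apis", ["show_result_pyplot", "show_result"])
print("visualisation function found:", show is not None)
```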

How to train your model from scratch?

Hi! Very interesting paper and I am interested in training the model from scratch. Do you have a script available for reference? I am not an expert in object detection and a reference script for full pipeline training would be greatly appreciated.

Thanks.

mmdetection v1.2 won't install without a GPU

Hi - It looks like I can't install mmdetection v1.2 without a GPU, even though I installed CUDA 10.0 and the appropriate version of cuDNN. Is this understanding correct? My installation fails with the error "no CUDA-capable device is detected".

I'm unable to find any proper information in open-mmlabs about this subject. However, I'm able to install v2.0 without a GPU, because I believe 2.0 has a default check to fall back to CPU if a GPU device is not found.

Keeping your model at v1.2 for people who run their projects with no GPU will likely make the usage of these models limited. Is there a way to convert the model trained on v1.2 to v2.0?

Thanks,
Sekhar H.

Training Metrics (Precision, Recall, and F1)

Hello, thank you for the VOC to Coco script. It was very helpful. Would it be fine to ask if you used a custom script to measure model accuracy? Are there any resources you can point where I can get more information?

No training script?

I have not found a training script in the Git repo.

Is there any plan to rewrite with TensorFlow?

Is there any plan to develop a TensorFlow version?

Error: cannot connect to X server (running the code in Azure (cloud) Linux machine)

Empty XML files as output

I was trying to run the code but it is returning empty XML files.

Input file

Output XML file
cTDaR_t10011.txt

Post-processing in this test case is so slow

CascadeTabNet/Table Structure Recognition/main.py

Lines 43 to 53 in 7147b41

    if len(res_border) != 0:
        ## call border script for each table in image
        for res in res_border:
            try:
                root.append(border(res,cv2.imread(i)))
            except:
                pass
    if len(res_bless) != 0:
        if len(res_cell) != 0:
            for no,res in enumerate(res_bless):
                root.append(borderless(res,cv2.imread(i),res_cell))

Test image: https://raw.githubusercontent.com/cndplab-founder/ICDAR2019_cTDaR/master/test/TRACKB2/cTDaR_t10080.jpg
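
One inexpensive thing to rule out, without claiming it is the actual bottleneck: the quoted lines decode the image from disk once per table. The sketch below reads it once and reuses it, and it also reports failures instead of silencing them. The function `collect_tables` and its injected callables are hypothetical stand-ins for `cv2.imread`, `border` and `borderless`. It assumes those functions do not modify the image in place.

```python
def collect_tables(image_path, res_border, res_bless, res_cell,
                   read_image, border, borderless):
    """Build table elements for one image, decoding the image only once."""
    if len(res_border) == 0 and len(res_bless) == 0:
        return []
    image = read_image(image_path)  # decoded once and shared by every table
    tables = []
    for res in res_border:
        try:
            tables.append(border(res, image))
        except Exception as exc:
            # Report the failure so a slow or broken table is visible.
            print(f"border() failed for {res}: {exc}")
    if len(res_cell) != 0:
        for res in res_bless:
            tables.append(borderless(res, image, res_cell))
    return tables
```

Timing each `border` and `borderless` call separately would show whether the time is spent in image decoding or in the line and cell processing itself.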

Do I need CUDA support in production?

How do I deploy in production? My server is on Aliyun and has no GPU card.

Can I deploy to a mobile device with TFLite?

If it is possible to deploy to a mobile device with TFLite, how do I do it?

Fine Tuning

Hi, can you please provide the scripts used for training, for the purpose of fine-tuning the model?

Thanks

name 'etree' is not defined

[Table status] : Processing table with lines
<PIL.Image.Image image mode=RGB size=812x349 at 0x7F9D02BBA860>
<PIL.Image.Image image mode=RGB size=1224x1584 at 0x7F9D012A60B8>
Traceback (most recent call last):
  File "CascadeTabNet/Table Structure Recognition/main.py", line 53, in <module>
    root.append(borderless(res,cv2.imread(i),res_cell))
  File "/content/gdrive/My Drive/CascadeTabNet/CascadeTabNet/Table Structure Recognition/Functions/blessFunc.py", line 365, in borderless
    tableXML = etree.Element("table")
NameError: name 'etree' is not defined
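
The traceback says that `etree` was never imported in the module that uses it. A minimal sketch of the missing piece, assuming the standard-library ElementTree is acceptable (the project may instead intend `lxml`, which offers a similar `etree` interface); the `Coords` child and its points are hypothetical illustration:

```python
# Standard-library ElementTree, aliased so existing `etree.` calls keep working.
import xml.etree.ElementTree as etree

tableXML = etree.Element("table")
coords = etree.SubElement(tableXML, "Coords")  # hypothetical child element
coords.set("points", "0,0 0,10 10,10 10,0")

xml_text = etree.tostring(tableXML, encoding="unicode")
print(xml_text)
```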

CUDA error

Hi, I have the right version of CUDA and am still getting this issue while running the main file. Can you help me with this?

Environment :
sys.platform: linux
Python: 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GPU 0: Tesla P100-PCIE-16GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.4.0+cu100
PyTorch compiling details: PyTorch built with:

GCC 7.3
Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.0
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.1
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0+cu100
OpenCV: 4.1.2
MMCV: 0.5.3
MMDetection: 1.2.0+0f33c08
MMDetection Compiler: GCC 7.5
MMDetection CUDA Compiler: 10.0

Error Traceback

Traceback (most recent call last):
  File "Table Structure Recognition/main.py", line 23, in <module>
    result = inference_detector(model, i)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/apis/inference.py", line 86, in inference_detector
    result = model(return_loss=False, rescale=True, **data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 149, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 130, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 324, in simple_test
    self.test_cfg.rpn) if proposals is None else proposals
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/test_mixins.py", line 34, in simple_test_rpn
    proposal_list = self.rpn_head.get_bboxes(*proposal_inputs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/anchor_head.py", line 276, in get_bboxes
    scale_factor, cfg, rescale)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/rpn_head.py", line 92, in get_bboxes_single
    proposals, _ = nms(proposals, cfg.nms_thr)
  File "/content/drive/My Drive/dundun/mmdetection/mmdet/ops/nms/nms_wrapper.py", line 54, in nms
    inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fde7991b193 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<_nv_dl_wrapper_t<nv_dl_tag<void ()(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef, __nv_dl_wrapper_t<_nv_dl_tag<void ()(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> const&) + 0x7bb (0x7fde7f58387b in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch.so)
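
"No kernel image is available" generally means a compiled binary contains no code for the GPU's architecture. As a rough check, the sketch below parses the NVCC architecture flags from the environment report above and tests whether they include the Tesla P100's compute capability (6.0). The helper names are hypothetical. The flags only describe the PyTorch build, so they say nothing about how the separately compiled mmdetection ops were built.

```python
import re

# Architecture flags copied from the environment report above.
NVCC_FLAGS = (
    "-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
    "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_37,code=compute_37"
)


def compiled_sm_archs(nvcc_flags):
    """Return the set of sm_XY targets named in an NVCC flag string."""
    return set(re.findall(r"code=(sm_\d+)", nvcc_flags))


def has_native_kernel(nvcc_flags, major, minor):
    """True when the flags include native code for compute capability major.minor."""
    return f"sm_{major}{minor}" in compiled_sm_archs(nvcc_flags)


print(sorted(compiled_sm_archs(NVCC_FLAGS)))
print("P100 (6.0) covered by the PyTorch build:", has_native_kernel(NVCC_FLAGS, 6, 0))
```

Since the PyTorch flags do list sm_60, one hypothesis worth testing is that the mmdetection extensions were compiled on a machine with a different GPU and then reused, in which case rebuilding them on the target machine may help.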

Run a prediction

I am using Pytorch for the first time. I am not able to understand how to run a prediction on an image to get the table. Please guide.

Which Python version is used?

How do I create training data labels? Should I annotate lines, such as vertical and horizontal lines, or the whole table?

Prediction in colab

Hi, I'm a newbie with mmdetection. I had some issues (also mentioned in the repository's open issues) trying to run it on CPU, so I decided to try a prediction in Colab, but I ran into the error below and could not find a solution on my own.

mmdetection version and config file

Which version of mmdetection are you using, and which config file goes with the checkpoints?

Building without CUDA support

How do I build mmdetection without CUDA support?

directory not empty error

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2'
Traceback (most recent call last):
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 92, in _file2dict
    osp.join(temp_config_dir, temp_config_name))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44\\tmpg99tl4cb.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 17, in <module>
    model = init_detector(config_fname, os.path.join(checkpoint_path, epoch))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmdet-2.0.0+a9bedfb-py3.6-win-amd64.egg\mmdet\apis\inference.py", line 28, in init_detector
    config = mmcv.Config.fromfile(config)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 165, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 105, in _file2dict
    temp_config_file.close()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 809, in __exit__
    self.cleanup()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 813, in cleanup
    _shutil.rmtree(self.name)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 494, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 393, in _rmtree_unsafe
    onerror(os.rmdir, path, sys.exc_info())
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 391, in _rmtree_unsafe
    os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44'

I get this error every time I run main.py with this config file and this model file.

Other Details:
CUDA version : 9.2
OS : Windows
Main.py config

image_path = 'Examples\\cTDaR_t10120.jpg'
xmlPath = 'Examples'


config_fname = "Examples\\faster_rcnn_hrnetv2p_w32_1x_coco.py" 
checkpoint_path = "C:\\Users\\ABC\\Music\\table-structure-rec\\CascadeTabNet\\Table Structure Recognition\\Examples\\"
epoch = 'faster_rcnn_hrnetv2p_w32_1x_coco_20200130-6e286425.pth'

Detection results are wrong

When I test the image cTDaR_t10120.jpg from the Examples folder, only one table is detected, and the cells are shown as follows:

I used epoch_36 for the test. In addition, once the code prints the following, it never exits and stays stuck there:

table: [ 323 208 1135 557]
[Table status] : Processing table with lines

Variation in Results

I was running the evaluation on ICDAR-13 using the pre-trained model you provided and the default configuration file.
There is a huge variation in the results.
recall:1.0, precision:0.843, f_measure:0.9216

Are you doing pre-processing on the testing images as well?
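
As a sanity check on the reported numbers, assuming "f_measure" means the standard F1 score (the harmonic mean of precision and recall): the sketch below recomputes it from the reported precision and recall. The function name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.843, 1.0
harmonic = f1_score(precision, recall)
arithmetic = (precision + recall) / 2

print(f"harmonic mean (F1): {harmonic:.4f}")
print(f"arithmetic mean:    {arithmetic:.4f}")
```

The harmonic mean comes out near 0.915, while the reported 0.9216 is very close to the arithmetic mean. It may therefore be worth checking how the evaluation script aggregates its scores, for example whether it averages over several IoU thresholds.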

Error: cannot connect to X server

Could you please help me address the issue below?
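
"Cannot connect to X server" usually appears when code tries to open a window on a machine without a display, such as a cloud VM. A minimal sketch of a guard, assuming the script can save its visualisation to a file instead of showing it; the helper name `has_display` is hypothetical:

```python
import os


def has_display(env=None):
    """Return True when an X11 or Wayland display is advertised in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


if has_display():
    print("A display is available; interactive windows should work.")
else:
    print("No display found; write images to disk instead of opening a window.")
```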

temp_lines_ver is None in borderFunc.py

Hi,

first of all, thank you for the great work.

I tried to run main.py in Table Structure Recognition and encountered the problem that temp_lines_ver is set to None when calling extract_table without the lines parameter in borderFunc.py.

See here

So when iterating over it here, this will obviously throw an exception.

I'm not familiar enough with the code yet. But it should be an easy one to fix for someone who is.

Cheers
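
A defensive guard is one possible fix, sketched under the assumption that `temp_lines_ver` is None simply because no vertical lines were found. The helper name `lines_or_empty` is hypothetical. The explicit `is None` test is deliberate, since truth-testing a NumPy array with `or` raises an error.

```python
def lines_or_empty(lines):
    """Return an empty list when a line detector produced nothing (None)."""
    return [] if lines is None else lines


temp_lines_ver = None  # what the issue reports
visited = 0
for line in lines_or_empty(temp_lines_ver):
    visited += 1
print("lines visited:", visited)
```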

Repair table

The lines at the top are missing. How can I repair them? Thank you.

Just wanted to let you guys know

I'm currently using the table detection feature for the proof of concept of my table transformation bachelor thesis. Great work!

Will it work in Windows?

Getting this error in windows: https://prnt.sc/sjhfw7

max() arg is an empty sequence

Hi, I encountered this problem when testing the model epoch_13

on this image
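
This message comes from Python's built-in `max` when it is given an empty iterable, which here may mean that no lines or cells were found for the image. Where exactly it is raised in the project is not shown, so the following is only a generic sketch; `safe_max` is a hypothetical helper built on the `default` argument of `max`.

```python
def safe_max(values, fallback=None):
    """Like max(), but return `fallback` when `values` is empty."""
    return max(values, default=fallback)


print(safe_max([3, 9, 4]))
print(safe_max([]))
print(safe_max([], fallback=0))
```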

Questions about the prediction of the model

The prediction of the model is a list of 80 arrays. Which one represents the cell bounding boxes and which represents the table bounding boxes? I am interested in extracting the vertices for bounding box of table.
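
In mmdetection 1.x, the box results are a list with one entry per class, and each entry holds rows of the form [x1, y1, x2, y2, score]. For mask models the detector returns a (boxes, masks) tuple, so the box list is `result[0]`. The sketch below filters one class by score. The class order (0 = bordered table, 1 = cell, 2 = borderless table) and the 0.85 threshold are assumptions, not confirmed in this issue, and `boxes_for_class` is a hypothetical helper.

```python
def boxes_for_class(bbox_results, class_index, score_thr=0.85):
    """Return integer [x1, y1, x2, y2] boxes of one class scoring above the threshold."""
    boxes = []
    for row in bbox_results[class_index]:
        x1, y1, x2, y2, score = row[:5]
        if score > score_thr:
            boxes.append([int(x1), int(y1), int(x2), int(y2)])
    return boxes


# Toy per-class results standing in for `result[0]`.
toy_results = [
    [[323.4, 208.1, 1135.9, 557.0, 0.97]],                # assumed: bordered tables
    [[330.0, 215.0, 400.0, 240.0, 0.91],
     [410.0, 215.0, 480.0, 240.0, 0.40]],                 # assumed: cells
    [],                                                   # assumed: borderless tables
]

print(boxes_for_class(toy_results, 0))
print(boxes_for_class(toy_results, 1))
```

The remaining arrays would be empty if the checkpoint was trained on fewer classes than the config declares.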

Is there an alternative to Colab?

I want to debug the code to understand it better. I originally used VS Code but gave up because my Mac does not support CUDA, so I tried Colab. However, debugging there is not a good experience for me, and each operation is slow.

So is there an alternative to Colab, or do you have any tips?

Is there a plan to develop a TensorFlow version?

I very much hope that you will develop a TensorFlow version.

Hi all,

I'm running the demo on Colab, but the "Run the Predictions" step does not succeed.

Predicted cell coordinates are shifted uniformly above the original cell positions

Hi,

output xml coordinates plotted on the input images:
https://prnt.sc/sl1f28
https://prnt.sc/sl1fi5
https://prnt.sc/sl1g42
(All blue lines are manually drawn based on the xml output coordinates for these images)

I have attached the input images that I used for this model, and I have drawn the bounding boxes manually using the XML output.

In all these images the bounding boxes are uniformly shifted upwards. From this I suspect that the image I send is resized at prediction time and that the XML output holds the coordinates of the resized image. (Please correct me if this is not the case.)

If the above is true, please let me know where I can get the resized images, so that the XML output coordinates match when plotted.

The next question is: if the above is true, how do the table detection coordinates match correctly? For all 3 attached table images, apart from the cell level, I have also drawn the bounding box for the entire table, and it fits perfectly.

If all the cell-level coordinates are realigned, how is it that only the table-level coordinates come out correctly?

Thanks,
Anand.

Training error

I was trying to train the model on a custom dataset for table detection in my local system with COCO style annotations.

However, I encounter an error while training:
mmdet - ERROR - The testing results of the whole dataset is empty.
The evaluation results on the validation data are all empty, so I am not able to generate results, as I get empty arrays as output.

I am not able to identify any issue. Any help will be appreciated.
