Comments (4)
One problem I noticed while cutting out the tables, which is not in this list (or I missed where to put it), is that content is conflated in the PDF.
For example, in 10.1016_j.pain.2014.08.023, Table 1, the content of the table is conjoined with the page number at the top of the page. The first image below shows the selection; the second shows the result when that selection is deleted.
See data/metadata.csv for several other papers that have this.
[Image: selection]
[Image: selection deleted]
from cm-ucl.
We should reselect the table and omit the page number. I don't see any reason why the content should disappear. What happens in oa-pmr?
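A minimal sketch of the reselection idea, assuming the page text is available as spans with coordinates (for example after PDF-to-SVG conversion). `TextSpan`, `Box`, `crop_selection`, and the sample coordinates are hypothetical names and values for illustration, not part of the cm-ucl code. It only shows how a tighter box leaves out the page number; it does not explain why content disappears when the original selection is deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSpan:
    """A piece of page text with its position (hypothetical structure)."""
    text: str
    x: float
    y: float  # distance from the top of the page


@dataclass(frozen=True)
class Box:
    """Axis-aligned selection box; y grows downwards from the page top."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, span: TextSpan) -> bool:
        return self.x0 <= span.x <= self.x1 and self.y0 <= span.y <= self.y1


def crop_selection(spans, box):
    """Keep only the spans whose anchor point lies inside the box."""
    return [s for s in spans if box.contains(s)]


# Made-up page content: a page number at the top, then a small table.
spans = [
    TextSpan("2345", 300.0, 20.0),   # page number at the top of the page
    TextSpan("Table 1", 60.0, 80.0),
    TextSpan("Age", 60.0, 100.0),
    TextSpan("42", 200.0, 100.0),
]

# A box that starts below the page number, so the number is omitted.
table_box = Box(x0=50.0, y0=70.0, x1=550.0, y1=400.0)
kept = crop_selection(spans, table_box)
print([s.text for s in kept])
```

If the page number and the table text turn out to be fused into a single span in the PDF, a box filter like this cannot separate them, and the span itself would need to be split first.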
Not a problem in corpus-oa-pmr
:-)
This is out of scope for the current cm-ucl project.