Comments (5)
It seems that PyPDF2 (which camelot depends on) has introduced a breaking change in its API.
The short-term fix is to manually install the last PyPDF2 version that works with the old API (anything before 3.0.0, as per the error message) after you have installed camelot:
python -m pip install "pypdf2<3"
Also, PyPDF2 is moving back to the name pypdf, which should be taken into account for the future. From their PyPI page:
NOTE: The PyPDF2 project is going back to its roots. PyPDF2==3.0.X will be the last version of PyPDF2. Development will continue with pypdf==3.1.0.
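To confirm that the pin actually took effect, something like the following could be run. This is only a rough sketch: the helper name is_legacy_pypdf2 is made up for illustration, and it assumes the version string starts with a plain integer major version (as in 2.12.1 or 3.0.0).

```python
from importlib import metadata


def is_legacy_pypdf2(version_string):
    """Return True if the version is older than 3.0.0 (old PdfFileReader API)."""
    # Hypothetical helper: only looks at the leading major version number.
    major = int(version_string.split(".")[0])
    return major < 3


if __name__ == "__main__":
    try:
        installed = metadata.version("PyPDF2")
    except metadata.PackageNotFoundError:
        print("PyPDF2 is not installed in this environment")
    else:
        print("PyPDF2", installed, "legacy API:", is_legacy_pypdf2(installed))
```

If this reports a 3.x version, the pinned install most likely went into a different environment than the one running the script.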
from camelot.
@juliatong , can you check with python -m pip freeze
what version of pypdf2 is installed? From your error message it seems it's still version 3.
Maybe the previous version was installed in a different virtual environment? Also, if you are using Jupyter notebooks maybe you need to quit and restart the kernel for the updated library to be loaded.
Anyway, this should not be a long term solution. Not sure if any maintainer is working on this?
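For the notebook case, a quick way to see whether the kernel is still holding on to an older import might look like this. It is a sketch under assumptions: the function name loaded_vs_installed is invented here, it assumes the import name and the distribution name are the same (true for PyPDF2), and it relies on the module exposing __version__.

```python
import sys
from importlib import metadata


def loaded_vs_installed(package="PyPDF2"):
    """Compare the version already imported in this process with the one on disk."""
    # Version of the module this process has already imported (None if not imported).
    module = sys.modules.get(package)
    loaded = getattr(module, "__version__", None)

    # Version recorded in the installed package metadata for this interpreter.
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None

    return {"interpreter": sys.executable, "loaded": loaded, "installed": installed}


if __name__ == "__main__":
    print(loaded_vs_installed())
```

If "loaded" and "installed" differ, the kernel imported the package before the reinstall and needs a restart. If "interpreter" points somewhere unexpected, pip probably installed into another environment.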
from camelot.
I have the same problem. Has this been resolved?
from camelot.
I ran python -m pip install "pypdf2<3", yet the error remained.
Successfully installed pypdf2-2.12.1
camelot-py 0.9.0
DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
from camelot.
Hi @paluigi,
Thanks a lot for the reply.
I solved the problem!
First of all, you are right. My Jupyter notebook indeed wasn't picking up the change, despite the "Successfully installed pypdf2-2.12.1" message. As you pointed out from the error message, the kernel was still using version 3. I restarted the kernel and the error message is gone. Big shout-out for your attention to detail.
However, after that a new error came up: raise RuntimeError('Ghostscript is not installed') RuntimeError: Please make sure that Ghostscript is installed. Yet pip show ghostscript says the package is indeed there...
The solution is to run the commands below.
sudo apt-get update
sudo apt-get install ghostscript
Here is the thing. The ghostscript package I installed through pip install is a Python interface to the Ghostscript C API, and it doesn't include Ghostscript itself. The Python package talks to the Ghostscript library but doesn't install the Ghostscript command-line executable (gs). That's why the lines above resolved my problem: they install the executable at the system level.
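A quick way to check whether the executable is visible on PATH, independent of the pip package, could be something like this. It is a minimal sketch: find_ghostscript is a made-up helper, the Windows names gswin64c and gswin32c are included as an assumption about typical installs, and this is not necessarily how camelot itself looks for Ghostscript.

```python
import shutil


def find_ghostscript(candidates=("gs", "gswin64c", "gswin32c")):
    """Return the path of the first Ghostscript executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    location = find_ghostscript()
    if location is None:
        print("No Ghostscript executable found on PATH")
    else:
        print("Ghostscript executable:", location)
```

If this prints nothing found while pip show ghostscript succeeds, only the Python wrapper is installed and the system package is still missing.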
from camelot.