PyPDF2

PyPDF2 is a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from PDFs.

Installation

You can install PyPDF2 via pip:

pip install PyPDF2

If you plan to use PyPDF2 to encrypt or decrypt PDFs that use AES, you will need to install some extra dependencies. RC4 encryption is supported by the regular installation.

pip install PyPDF2[crypto]

Usage

from PyPDF2 import PdfReader

reader = PdfReader("example.pdf")
number_of_pages = len(reader.pages)
page = reader.pages[0]
text = page.extract_text()

PyPDF2 can do much more, such as splitting, merging, reading and creating annotations, and decrypting and encrypting PDFs.

Please see the documentation for more usage examples!

A lot of questions are asked and answered on StackOverflow.

Contributions

Maintaining PyPDF2 is a collaborative effort. You can support PyPDF2 by writing documentation, helping to narrow down issues, and adding code.

Q&A

The experience of PyPDF2 users covers the whole range, from beginners who want to make their lives easier to experts who developed software before PDF existed. You can contribute to the PyPDF2 community by answering questions on StackOverflow, helping in discussions, and asking users who report issues for MCVEs (code + example PDF!).

Issues

A good bug ticket includes an MCVE: a minimal, complete, verifiable example. For PyPDF2, this means that you must upload a PDF that causes the bug to occur, as well as the code you're executing with all of its output. Use print(PyPDF2.__version__) to tell us which version you're using.
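
The version information for a ticket could be gathered with a small script like the sketch below. The function name `environment_report` is made up for this example. Including the Python version and platform is a suggestion that goes beyond what is asked for above.

```python
import platform


def environment_report():
    """Collect version details that are useful in a bug report (hypothetical helper)."""
    try:
        import PyPDF2
        pypdf2_version = PyPDF2.__version__
    except ImportError:
        # Lets the script run even where PyPDF2 is missing.
        pypdf2_version = "not installed"
    return {
        "PyPDF2": pypdf2_version,
        "Python": platform.python_version(),
        "Platform": platform.platform(),
    }


# Print the details in a form that can be pasted into a ticket.
for name, value in environment_report().items():
    print(f"{name}: {value}")
```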

Code

All code contributions are welcome, but smaller ones have a better chance of being included in a timely manner. Adding unit tests for new features or test cases for bugs you've fixed helps us ensure that the Pull Request (PR) is fine.

PyPDF2 includes a test suite which can be executed with pytest:

$ pytest
===================== test session starts =====================
platform linux -- Python 3.6.15, pytest-7.0.1, pluggy-1.0.0
rootdir: /home/moose/Github/Martin/PyPDF2
plugins: cov-3.0.0
collected 233 items

tests/test_basic_features.py ..                         [  0%]
tests/test_constants.py .                               [  1%]
tests/test_filters.py .................x.....           [ 11%]
tests/test_generic.py ................................. [ 25%]
.............                                           [ 30%]
tests/test_javascript.py ..                             [ 31%]
tests/test_merger.py .                                  [ 32%]
tests/test_page.py .........................            [ 42%]
tests/test_pagerange.py ................                [ 49%]
tests/test_papersizes.py ..................             [ 57%]
tests/test_reader.py .................................. [ 72%]
...............                                         [ 78%]
tests/test_utils.py ....................                [ 87%]
tests/test_workflows.py ..........                      [ 91%]
tests/test_writer.py .................                  [ 98%]
tests/test_xmp.py ...                                   [100%]

========== 232 passed, 1 xfailed, 1 warning in 4.52s ==========
