pikepdf

pikepdf is a Python library for reading and writing PDF files.

pikepdf is based on QPDF, a powerful PDF manipulation and repair library.

Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf".

# Elegant, Pythonic API
import pikepdf

# Open a PDF, drop its last page, and save the result under a new name
with pikepdf.open('input.pdf') as pdf:
    num_pages = len(pdf.pages)
    del pdf.pages[-1]
    pdf.save('output.pdf')

To install:

pip install pikepdf

For users who want to build from source, see installation.

pikepdf is documented and actively maintained. Commercial support is available. We support nearly all x86-64 platforms, including PyPy, and support Apple Silicon on a best-effort basis.

Features

This library is similar to PyPDF2 and pdfrw: it provides low-level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It cannot render a PDF to an image.

Feature pikepdf PyPDF2 pdfrw
Editing, manipulation and transformation of existing PDFs
Based on an existing, mature PDF library QPDF
Implementation C++ and Python Python Python
PDF versions supported 1.1 to 1.7 1.3? 1.7
Python versions supported 3.7-3.10 (see footnote 1) 2.7-3.10 2.6-3.6
Save and load password protected (encrypted) PDFs ✔ (except public key) ✘ (Only obsolete RC4) ✘ (not at all)
Save and load PDF compressed object streams (PDF 1.5)
Creates linearized ("fast web view") PDFs
Actively maintained pikepdf commit activity PyPDF2 commit activity pdfrw commit activity
Test suite coverage codecov codecovpypdf2 unknown
Creates PDFs that pass PDF validation tests ?
Modifies PDF/A without breaking PDF/A compliance ?
Automatically repairs PDFs with internal errors
PDF XMP metadata editing read-only
Documentation
Integrates with Jupyter and IPython notebooks for rapid development

Testimonials

"I decided to try writing a quick Python program with pikepdf to automate [something] and it 'just worked'." –Jay Berkenbilt, creator of QPDF

"Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." –@cfcurtis

In Production

  • OCRmyPDF uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs.

  • pdfarranger is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs.

  • PDFStitcher is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition).

License

pikepdf is provided under the Mozilla Public License 2.0 (MPL), which can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.

Informally, MPL 2.0 is not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications to pikepdf in source code form. In other words, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL - see the guidelines for notes on use in GPL.

The debian/copyright file describes licensing terms for the test suite and the provenance of test resources.

Footnotes

  1. pikepdf 3.x and older support Python 3.6.
