Download APTNotes

Download and (optionally) parse APTNotes quickly and easily

Installation

pip install download-aptnotes

To enable parsing of the downloaded PDFs, install the extra tika. This attempts to install the Apache Tika Server, which depends on Java 7+, so make sure an adequate version of Java is installed before you install the extra. Without this extra, the only available output format is pdf.

pip install "download-aptnotes[tika]"
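
Because the tika extra relies on Java 7+, it can help to check the installed Java version before installing it. The following is a minimal sketch and not part of download-aptnotes: the helper names parse_java_major and installed_java_major are hypothetical, and the code assumes that `java -version` prints a quoted version string such as "1.8.0_292" or "11.0.2", which is common but not guaranteed for every Java vendor.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output.

    Hypothetical helper. Legacy versions look like "1.8.0_292" (major 8),
    newer ones look like "11.0.2" (major 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy "1.x" scheme: the real major version is the second component.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` on PATH, or None if unavailable."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    # `java -version` traditionally writes to stderr; check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java not found or version not recognized")
    elif major < 7:
        print(f"Java {major} is too old; the tika extra needs Java 7+")
    else:
        print(f"Java {major} satisfies the Java 7+ requirement")
```

If the script reports that Java is missing or too old, install a suitable Java runtime before installing the extra.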

Usage

Usage: download-aptnotes [OPTIONS]

  Download and (optionally) parse APTNotes quickly and easily

Options:
  -f, --format [pdf|sqlite|json|csv]
                                  Output format  [required]
  -o, --output PATH               Output path of file or directory  [required]
  -l, --limit INTEGER             Number of files to download
  -p, --parallel INTEGER          Number of parallel downloads  [default: 10]
  --install-completion            Install completion for the current shell.
  --show-completion               Show completion for the current shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.

Download all documents, parse them and store them in an SQLite database:

download-aptnotes -f sqlite -o aptnotes.sqlite

Download the first 10 documents in the source list, parse them and store them in an SQLite database:

download-aptnotes -f sqlite -o aptnotes.sqlite -l 10
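
The layout of the resulting database is not documented here, so a cautious first step is to inspect it rather than assume table or column names. The following sketch uses only Python's standard sqlite3 module to list the tables in the file and count their rows. The function name table_row_counts is hypothetical, and nothing in it depends on the schema download-aptnotes actually writes.

```python
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import Dict


def table_row_counts(db_path: str) -> Dict[str, int]:
    """Map each user table in an SQLite file to its row count.

    Hypothetical helper; it discovers table names from sqlite_master
    instead of assuming any particular schema.
    """
    # sqlite3.connect would silently create a missing file, so check first.
    if not Path(db_path).is_file():
        raise FileNotFoundError(db_path)

    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        # Skip SQLite's internal bookkeeping tables.
        names = [name for (name,) in rows if not name.startswith("sqlite_")]

        counts: Dict[str, int] = {}
        for name in names:
            # Quote the identifier so unusual table names stay valid SQL.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
```

Calling table_row_counts("aptnotes.sqlite") after one of the commands above returns a dictionary of table names and row counts, which is a reasonable starting point for writing real queries.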

Download all documents and store them as individual files in a directory:

download-aptnotes -f pdf -o aptnotes/
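
The invocations above can also be scripted, for example from a scheduled job. The following sketch builds the argument list from the options shown in the usage text (-f, -o, -l, -p) and runs the command with subprocess. The function names build_command and run_download are hypothetical, and the sketch assumes the download-aptnotes executable is on the PATH.

```python
import subprocess
from typing import List, Optional

# Output formats listed in the usage text above.
FORMATS = ("pdf", "sqlite", "json", "csv")


def build_command(
    fmt: str,
    output: str,
    limit: Optional[int] = None,
    parallel: Optional[int] = None,
) -> List[str]:
    """Build a download-aptnotes argument list (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["download-aptnotes", "-f", fmt, "-o", output]
    if limit is not None:
        cmd += ["-l", str(limit)]
    if parallel is not None:
        cmd += ["-p", str(parallel)]
    return cmd


def run_download(
    fmt: str,
    output: str,
    limit: Optional[int] = None,
    parallel: Optional[int] = None,
) -> int:
    """Run the CLI and return its exit code.

    Assumes `download-aptnotes` is installed and on PATH.
    """
    cmd = build_command(fmt, output, limit, parallel)
    return subprocess.run(cmd, check=False).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with output paths that contain spaces.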

Contributing

Dependencies:

  • Java 7+
  • Poetry

Clone this repository and install all dependencies:

git clone https://github.com/nikstur/download-aptnotes.git
cd download-aptnotes
poetry install
