obsidian-marker's Introduction

🌟 Introduction

Welcome to the Obsidian PDF to Markdown Converter! This extension brings the power of advanced PDF conversion directly into your Obsidian vault. By leveraging the capabilities of the Marker API, this plugin offers a seamless way to transform your PDFs into rich, formatted Markdown files.

Important

This extension requires a Marker API endpoint to function. Without an endpoint, the extension won't work.

The related repositories are the Marker project and the Marker API, both credited under Acknowledgements.

🚀 Features

  • OCR Capabilities: Convert scanned PDFs to searchable text
  • Formula Detection: Accurately captures and converts mathematical formulas
  • Table Extraction: Preserves table structures in your Markdown output
  • Image Handling: Extracts and saves images from your PDFs
  • Mobile Compatibility: Works on both desktop and mobile Obsidian apps
  • Flexible Output: Choose between full content extraction or specific elements (text/images)

🛠 Why This Extension?

  1. Superior Extraction: Utilizes the Marker project's advanced AI model for high-quality conversions
  2. Mobile Accessibility: Unlike many converters, this works seamlessly on mobile devices
  3. Customizable: Tailor the conversion process to your specific needs
  4. Obsidian Integration: Converts PDFs directly within your Obsidian environment

♥️ Support the Project

If you enjoy this extension, feel free to star the repository and share it with others! If you want to support development, consider buying me a coffee.

📋 Requirements

To use this extension, you'll need:

  1. An Obsidian vault
  2. Access to a Marker API endpoint (self-hosted or paid service)

🔧 Setup

  1. Install the extension in your Obsidian vault

  2. Configure your Marker API endpoint in the plugin settings

  3. (Optional) Set up a self-hosted Marker API:

    • Use Docker on a machine with a solid GPU/CPU
    • (Optional) Make the endpoint available to other devices (e.g., using Tailscale)
    • Alternatively, host in the cloud or run the Python server as needed
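The default endpoint value is written as a bare host and port ('localhost:8000'), so it may help to see how such a value could be turned into a full request URL. The following is a minimal sketch, not the plugin's actual code: `normalizeEndpoint` and `convertUrl` are hypothetical helper names, the fallback to plain `http://` is an assumption, and the `/convert` path is taken from the server log quoted under Issues (other Marker API deployments may expose a different path).

```typescript
// Hypothetical helper, not taken from the plugin source: turn a user-entered
// endpoint such as 'localhost:8000' into a full base URL.
function normalizeEndpoint(raw: string): string {
  const trimmed = raw.trim();
  // Add a scheme when only host:port was entered (assumption: plain HTTP).
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `http://${trimmed}`;
  // Drop trailing slashes so a path can be appended predictably.
  return withScheme.replace(/\/+$/, "");
}

// Builds the conversion URL. The '/convert' path comes from the server log
// quoted under Issues; treat it as an assumption for your own deployment.
function convertUrl(rawEndpoint: string): string {
  return `${normalizeEndpoint(rawEndpoint)}/convert`;
}

console.log(convertUrl("localhost:8000"));
```

If the endpoint is shared with other devices (for example over Tailscale), the same helper would accept that machine's host name and port in place of `localhost:8000`.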

⚙️ Settings

| Setting | Default | Description |
| --- | --- | --- |
| markerEndpoint | 'localhost:8000' | The URL of your Marker API endpoint |
| createFolder | true | Bundle all output files in a folder |
| movePDFtoFolder | false | Move the original PDF to the output folder |
| createAssetSubfolder | true | Create a subfolder for extracted assets (images, etc.) |
| extractContent | 'all' | Options: 'Extract everything', 'Text Only', 'Images Only' |
| writeMetadata | true | Include metadata in the converted Markdown file |
| deleteOriginal | false | Delete the original PDF after conversion |
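To make the table above concrete, here is one way these settings and their defaults could be modelled in TypeScript. This is a sketch under stated assumptions rather than the plugin's real source: the field names and default values come from the table, while `MarkerSettingsSketch`, `DEFAULTS`, and `withDefaults` are hypothetical names, and `extractContent` is typed as a plain string because the table lists only the display labels, not the stored values.

```typescript
// Sketch of a settings shape mirroring the table above. Field names and
// defaults come from the table; the interface name is hypothetical.
interface MarkerSettingsSketch {
  markerEndpoint: string;
  createFolder: boolean;
  movePDFtoFolder: boolean;
  createAssetSubfolder: boolean;
  extractContent: string; // 'all' by default; other stored values are not listed
  writeMetadata: boolean;
  deleteOriginal: boolean;
}

const DEFAULTS: MarkerSettingsSketch = {
  markerEndpoint: "localhost:8000",
  createFolder: true,
  movePDFtoFolder: false,
  createAssetSubfolder: true,
  extractContent: "all",
  writeMetadata: true,
  deleteOriginal: false,
};

// Overlay saved values on the defaults so that missing keys fall back cleanly.
function withDefaults(saved: Partial<MarkerSettingsSketch>): MarkerSettingsSketch {
  return { ...DEFAULTS, ...saved };
}

// Example: only two values were changed by the user (the address is made up).
const settings = withDefaults({ markerEndpoint: "192.168.1.20:8000", movePDFtoFolder: true });
console.log(settings);
```

Merging over a defaults object means a newly added setting picks up its default even when older saved data does not contain it.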

🙏 Acknowledgements

This extension wouldn't be possible without the incredible work of:

  • Marker Project: The AI model powering the conversions
  • Marker API: The API that enables self-hosting of the conversion service

A huge thank you to these projects for their contributions to the community!

🐛 Troubleshooting

If you encounter issues related to the extension itself, please open an issue in this repository. For problems with the conversion process or API, please refer to the Marker and Marker API repositories.

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.


Happy converting! 📚➡️📝


obsidian-marker's Issues

Datalab API not supported

Very sorry if I am a dummy. I really don't know what I am doing. I am trying to configure the endpoint in the settings. I'm using the Datalab API, but it doesn't connect. I checked that the service is working fine. How do I properly format the endpoint on the settings page in Obsidian?

Get an error when posting pdf

(task, pid=1465) INFO: 127.0.0.1:38582 - "POST /convert HTTP/1.1" 200 OK
(task, pid=1465) Could not determine filetype for b''

When using a curl command in the terminal, everything works just fine (for the same PDF).
