Git Product home page Git Product logo

ai_pdf's Introduction

Chat locally with any PDF

Ask questions, get answer with usefull references

Work well with math pdfs (convert them to LaTex, a math syntax comprehensible by computer)

Work flow chart

RAG_diagrams

Demos

chatbot test with some US Laws pdf

test_ai_top_secret.mp4

chatbot test with math pdf (interpereted as latex by the LLM)

test_math.mp4

full length process of converting pdf to latex, then using the chat bot

full_test_math.mp4

How to use

  • Clone the project to some location that we will call 'x'

  • install requierements listed in the requirements.txt file

  • (open terminal, go to the 'x' location, run pip install -r requirements.txt)

  • ([OPTIONAL] for better performance during embedding, install pytorch with cuda, go to https://pytorch.org/get-started/locally/)

  • Put your pdfs in x/ai_pdf/documents/pdfs

  • Run x/ai_pdf/main.py

  • Select or not math mode

  • Choose the pdf you want to work on

  • Wait a little bit for the pdf to get vectorized (check task manager to see if your gpu is going vrum)

  • Launch LM Studio, Go to the local Server tab, choose the model you want to run, choose 1234 as server port, start server

  • (If you want to use open-ai or any other cloud LLM services, change line 10 of x/ai_pdf/back_end/inference.py with your api_key and your provider url)

  • Ask questions to the chatbot

  • Get answer

  • Go eat cookies

TODO

  • Option tabs
    • add more different embedding models
    • add menu to choose how many relevant chunk of information the vector search should get from the vector db
    • menu to configure api url and api key

Maybe in the futur

  • Add special support for code PDF (with specialized langchain code spliter)
  • Add Multimodality

ai_pdf's People

Contributors

crizomb avatar

Stargazers

Maycon Moreira avatar

Watchers

Andrea de Luca avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.