auto-paper-analysis's Introduction

Auto Paper Analysis

This project automatically generates questions and answers (QAs) for a given set of arXiv IDs. For now, the CLI tool can collect arXiv IDs automatically only from Hugging Face 🤗 Daily Papers. Alternatively, you can pass a set of arXiv IDs directly.

You can see the generated QA dataset in the chansung/auto-paper-qa2 repository. You can also see how this dataset can be used in the PaperQA space application.

Instructions

If you want to do prompt engineering, modify the prompts.toml file. There are two prompts to play with.

Hugging Face 🤗 Daily Papers

To generate QAs for the arXiv papers listed on a specific date, run:

export GEMINI_API_KEY=<YOUR-GEMINI-API>
export HF_ACCESS_TOKEN=<YOUR-HF-ACCESS-TOKEN>

# $current_date and $hf_repo_id are shell variables you must set before running.
python app.py --target-date "$current_date" \
    --gemini-api "$GEMINI_API_KEY" \
    --hf-token "$HF_ACCESS_TOKEN" \
    --hf-repo-id "$hf_repo_id" \
    --hf-daily-papers

To generate QAs for arXiv papers over a range of dates, run:

export GEMINI_API_KEY=<YOUR-GEMINI-API>
export HF_ACCESS_TOKEN=<YOUR-HF-ACCESS-TOKEN>
export HF_DATASET_REPO_ID=<YOUR-HF-DATASET-REPO-ID>

./date_iterator.sh "2024-03-01" "2024-03-03" $HF_DATASET_REPO_ID
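
The contents of date_iterator.sh are not shown here, so the following Python sketch only illustrates what iterating over a date range could look like. It assumes the range includes both endpoints and that each date is passed to app.py with the flags from the single-date command above. The names iter_dates and build_daily_command are hypothetical and do not come from the project.

```python
from datetime import date, timedelta


def iter_dates(start, end):
    """Yield ISO dates from start to end (inclusive endpoints are an assumption)."""
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while current <= last:
        yield current.isoformat()
        current += timedelta(days=1)


def build_daily_command(target_date, gemini_api, hf_token, repo_id):
    """Assemble the single-date app.py invocation as an argument list."""
    return [
        "python", "app.py",
        "--target-date", target_date,
        "--gemini-api", gemini_api,
        "--hf-token", hf_token,
        "--hf-repo-id", repo_id,
        "--hf-daily-papers",
    ]


# Dry run: print one command per day instead of executing anything.
for day in iter_dates("2024-03-01", "2024-03-03"):
    print(" ".join(build_daily_command(day, "<gemini-key>", "<hf-token>", "<repo-id>")))
```

To actually run the commands, each argument list could be handed to subprocess.run with real credentials taken from the environment.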

arXiv IDs

To generate QAs for a list of arXiv IDs, run:

export GEMINI_API_KEY=<YOUR-GEMINI-API>
export HF_ACCESS_TOKEN=<YOUR-HF-ACCESS-TOKEN>

# $hf_repo_id is a shell variable you must set before running.
python app.py \
    --gemini-api "$GEMINI_API_KEY" \
    --hf-token "$HF_ACCESS_TOKEN" \
    --hf-repo-id "$hf_repo_id" \
    --arxiv-ids <arxiv-id> <arxiv-id> ...
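
When the list of IDs comes from a file or another script, it can help to tidy it before building the command. The Python sketch below is one possible way to do that, not part of the project. It assumes new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN, optionally followed by a version suffix such as v2). Whether app.py accepts version suffixes or old-style identifiers is not stated here. The names clean_arxiv_ids and build_ids_command are hypothetical, and the example IDs are made up.

```python
import re

# New-style arXiv identifier: YYMM.NNNN or YYMM.NNNNN, optional version suffix vN.
ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")


def clean_arxiv_ids(raw_ids):
    """Strip whitespace, skip blanks, drop duplicates (keeping order), reject malformed IDs."""
    cleaned = []
    for raw in raw_ids:
        arxiv_id = raw.strip()
        if not arxiv_id:
            continue
        if not ARXIV_ID.fullmatch(arxiv_id):
            raise ValueError(f"not a new-style arXiv ID: {arxiv_id!r}")
        if arxiv_id not in cleaned:
            cleaned.append(arxiv_id)
    return cleaned


def build_ids_command(arxiv_ids, gemini_api, hf_token, repo_id):
    """Assemble the app.py invocation for an explicit list of arXiv IDs."""
    return [
        "python", "app.py",
        "--gemini-api", gemini_api,
        "--hf-token", hf_token,
        "--hf-repo-id", repo_id,
        "--arxiv-ids", *arxiv_ids,
    ]


# Made-up IDs, for illustration only.
ids = clean_arxiv_ids([" 2403.00001 ", "2403.00001", "2402.1234v2"])
print(" ".join(build_ids_command(ids, "<gemini-key>", "<hf-token>", "<repo-id>")))
```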

Acknowledgements

This project was built during the Gemini sprint held by Google's ML Developer Programs team. I am thankful for the generous GCP credits granted to finish this project.

auto-paper-analysis's People

Contributors

deep-diver, hobeom
