
GenAI Stack

End-to-End Secure & Private Generative AI for All
(Your data, your LLM, your Control)


GenAI Stack is an end-to-end framework for integrating LLMs into any application. It can be deployed on your own infrastructure, ensuring data privacy, and comes with everything you need, from data extraction and vector stores to reliable model deployment.

👉 Join our Discord community!

Getting started on Colab

Try out a quick demo of GenAI Stack on Google Colab:


Quick install

pip install genai_stack

OR

pip install git+https://github.com/aiplanethub/genai-stack.git

Documentation

The documentation for GenAI Stack can be found at genaistack.aiplanet.com.

GenAI Stack Workflow

(diagram: GenAI Stack workflow)

What is GenAI Stack all about?

GenAI Stack is an end-to-end framework designed to integrate large language models (LLMs) into applications seamlessly. The purpose is to bridge the gap between raw data and actionable insights or responses that applications can utilize, leveraging the power of LLMs.

In short, it orchestrates and streamlines your generative AI development journey, from the initial ETL (Extract, Transform, Load) data-processing steps to the refined LLM inference stage. GenAI Stack ensures data privacy, domain-driven responses, and factual grounding without the hallucinations commonly associated with raw LLM output.
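The ETL-to-inference flow above boils down to a retrieve-then-generate loop. The following toy sketch (plain Python, with bag-of-words similarity standing in for real embeddings and with the LLM call elided) illustrates the pattern GenAI Stack orchestrates; every name here is illustrative, not the genai-stack API.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts (real stacks use dense vectors).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    def __init__(self) -> None:
        self.docs: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:  # the "load" end of the ETL step
        self.docs.append((embed(text), text))

    def query(self, question: str, k: int = 1) -> list[str]:
        qv = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = ToyVectorStore()
store.add("GenAI Stack supports GPT4All and GPT-3.5 models.")
store.add("Vector databases store document embeddings for retrieval.")

# Retrieval grounds the prompt in source data, which is what keeps
# the final LLM answer factual rather than hallucinated.
context = store.query("Which models are supported?")[0]
prompt = f"Answer using only this context:\n{context}\n\nQ: Which models are supported?"
```

In a real deployment, `embed` would be an embedding model, `ToyVectorStore` a vector database, and `prompt` would be sent to the configured LLM.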

How can GenAI Stack be helpful?

  1. ETL Simplified: GenAI Stack streamlines the complex landscape of data processing, handling extraction, transformation, and loading of your data into vector stores.
  2. Hallucination-Free Inference: Bid adieu to the common headaches associated with AI-generated content filled with hallucinations. Our orchestrator’s unique architecture ensures that the LLM inference stage produces outputs rooted in reality and domain expertise. This means you can trust the information generated and confidently utilize it for decision-making, research, and communication purposes.
  3. Seamless Integration: Integrating GenAI Stack into your existing workflow is straightforward, whether you're a seasoned AI developer or just starting out.
  4. Customization and Control: Tailor the ETL processes, vector databases, fine-tune inference parameters, and calibrate the system to meet your project’s unique requirements.

Use Cases:

  • AI-Powered Search Engine: Enhance search with context-aware results, moving beyond simple keyword matching.
  • Knowledge Base Q&A: Provide direct, dynamic answers from databases, making data access swift and user-friendly.
  • Sentiment Analysis: Analyze text sources to gauge public sentiment, offering businesses real-time feedback.
  • Customer Support Chatbots: Enhance the operational efficiency of customer support teams with near-accurate responses to support queries.
  • Information Retrieval on Large Volumes of Documents: Quickly extract specific information or related documents from vast repositories, streamlining data management.

Get in Touch

You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet's open-source LLMs (effi and Panda Coder) and GenAI Stack. Schedule the call here: https://calendly.com/jaintarun

Contribution guidelines

GenAI Stack thrives in the rapidly evolving landscape of open-source projects. We wholeheartedly welcome contributions in various capacities, be it through innovative features, enhanced infrastructure, or refined documentation.

For a comprehensive guide on the contribution process, please click here.

Acknowledgements

and the entire open-source community.


genai-stack's Issues

[Feature] Addition of LLM Chain

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it adds the LLM Chain feature, i.e., connecting two LLM model responses into a chain.

[GenAI Stack Server RestAPI's] Data model for request body

  • GenAI Stack version: 2.0.5
  • Python version: 3.8
  • Operating System: macOS Monterey

Description

Currently the ETL REST API doesn't have a Pydantic data model for the request body. The REST APIs are built with the FastAPI framework, and the documentation is autogenerated by FastAPI itself.

(screenshot: autogenerated FastAPI docs for the submit-job endpoint)

As the image above shows, the submit-job endpoint is a POST method with no information about the request body's structure and types, so adding a data model would document the request body.

Related documentation:
  • GenAI Stack Server: https://genaistack.aiplanet.com/advanced-guide/genai_stack_server
  • API Reference: https://genaistack.aiplanet.com/advanced-guide/openapi
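A minimal sketch of what such a data model could look like, assuming Pydantic (which FastAPI uses for request bodies). The field names below are illustrative guesses, not the actual submit-job schema:

```python
from typing import Optional
from pydantic import BaseModel

class SubmitJobRequest(BaseModel):
    # Illustrative fields only; the real submit-job schema may differ.
    loader: str                     # e.g. "pdf", "csv", "webpage"
    source: str                     # path or URL of the data to ingest
    vectordb: Optional[str] = None  # target vector database, if any

# FastAPI would then render the schema in the autogenerated docs:
# @app.post("/submit-job")
# def submit_job(body: SubmitJobRequest): ...
```

With a model like this in place, the autogenerated docs show the expected fields, their types, and which are optional.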

[New Example] Addition of Use Cases

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it contains a use case with an application that makes an impact

Unable to install requirements

  • GenAI Stack version:
  • Python version:3.6
  • Operating System:windows

Description

Trying to install the required libraries from ui/requirements.txt but getting version-related issues.

ERROR: Could not find a version that satisfies the requirement attrs==23.1.0 (from -r requirements.txt (line 2)) (from versions: 15.0.0a1, 15.0.0, 15.1.0, 15.2.0, 16.0.0, 16.1.0, 16.2.0, 16.3.0, 17.1.0, 17.2.0, 17.3.0, 17.4.0, 18.1.0, 18.2.0, 19.1.0, 19.2.0, 19.3.0, 20.1.0, 20.2.0, 20.3.0, 21.1.0, 21.2.0, 21.3.0, 21.4.0, 22.1.0, 22.2.0)
ERROR: No matching distribution found for attrs==23.1.0 (from -r requirements.txt (line 2))

ERROR: Could not find a version that satisfies the requirement blinker==1.6.2 (from -r requirements.txt (line 3)) (from versions: 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5)
ERROR: No matching distribution found for blinker==1.6.2 (from -r requirements.txt (line 3))

What I Did

pip install -r requirements.txt

If we remove the version pins, installation succeeds. (The pinned attrs 23.1.0 and blinker 1.6.2 both require Python >= 3.7, so they cannot be resolved on Python 3.6; without pins, pip selects older releases that still support 3.6.)

[Model] Addition of new Open Source LLM model

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it adds a new model to the model list
    • Currently we support GPT4All and GPT-3.5; we appreciate new open-source LLM model support

Enable Pipeline in Hugging Face models

We need functionality to use pipelines directly with the HuggingFace model. Many data scientists are comfortable declaring pipelines from Hugging Face directly, instead of passing everything through model_kwargs and pipeline_kwargs, which they find confusing:

How we currently build pipelines:

llm = HuggingFaceModel.from_kwargs(
    model=model_name_or_path,
    task="text-generation",
    model_kwargs={
        "device_map": "cuda",
        "quantization_config": quantization_config,
        "trust_remote_code": False,
        "low_cpu_mem_usage": True,
    },
    pipeline_kwargs={
        "max_new_tokens": 512,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.95,
        "top_k": 40,
        "repetition_penalty": 1.1,
    },
)

How some data scientists expect to use Hugging Face:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    trust_remote_code=False,
    revision="gptq-8bit-32g-actorder_True",
)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)
llm = HuggingFaceModel.from_kwargs(pipeline=pipe)

While the latter needs more lines of code, it gives the data scientist much more control and customisability when declaring the model. We can add one more kwarg, pipeline, in which the user can specify the pipeline directly.
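One possible shape for this feature, sketched with hypothetical names (not the actual genai-stack internals): from_kwargs checks for a prebuilt pipeline and short-circuits the internal construction path.

```python
# Hypothetical sketch only; the real HuggingFaceModel internals will differ.
class HuggingFaceModel:
    def __init__(self, pipeline=None, **kwargs):
        self.pipeline = pipeline
        self.kwargs = kwargs

    @classmethod
    def from_kwargs(cls, **kwargs):
        # A user-supplied pipeline takes precedence over the
        # model_kwargs/pipeline_kwargs-driven construction path.
        pipeline = kwargs.pop("pipeline", None)
        return cls(pipeline=pipeline, **kwargs)

    def predict(self, prompt: str) -> str:
        if self.pipeline is not None:
            # transformers text-generation pipelines return
            # a list of {"generated_text": ...} dicts.
            return self.pipeline(prompt)[0]["generated_text"]
        raise NotImplementedError("internal pipeline construction elided")
```

A side benefit: any callable with the transformers pipeline interface works here, so the feature can be unit-tested with a stub instead of loading a real model.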

[No-Code] Addition of new blogs

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it adds new blogs to the documentation.
    • You can write a blog on GenAI Stack in Markdown; we will feature it as a Community Spotlight along with your name

[Component] Loaders for ETL

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+
  • Operating System: Linux/Windows/Mac

Description

Add custom loaders rather than relying on other tools (each can be contributed individually).

  • PDF
  • CSV
  • Text (.txt)
  • Excel (.xlsx)
  • JSON
  • Webpage
  • YouTube
  • Directory of documents (specific extensions or all files in a given directory)
  • Jira
  • HubSpot
  • Airbyte
  • Databases (MySQL/MariaDB, PostgreSQL, MongoDB, etc.)
  • Git repo
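A shared contract for these loaders could be as simple as the following sketch. The class and method names are assumptions for illustration, not the existing genai-stack interface:

```python
from abc import ABC, abstractmethod
from pathlib import Path

class BaseLoader(ABC):
    """Hypothetical contract: every loader turns a source into text documents."""

    @abstractmethod
    def load(self, source: str) -> list[str]:
        ...

class TxtLoader(BaseLoader):
    # Simplest possible loader: one .txt file -> one document.
    def load(self, source: str) -> list[str]:
        return [Path(source).read_text(encoding="utf-8")]
```

Each loader in the checklist (PDF, CSV, Jira, ...) would then only need to implement `load`, keeping the ETL side pluggable.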

[Documentation] Working on improving function Flow chart workflows

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it improves the existing documentation by adding flow chart diagrams

Similarity search is not accurate/not working in some cases

  • GenAI Stack version: 0.2.5
  • Python version: 3.10
  • Operating System: Ubuntu(WSL)

Description

In some cases, even though relevant context for the question is present in the source, the query fails to retrieve it.

Question: What is the bottom up process?
Response generated: {'output': 'Based on the given context, there is no information provided about the bottom-up process. Therefore, it is not possible to answer the question.'}
Response time: 1.0319557189941406

Source file - Employee-Stock-Option-Plans-ESOP-Best-Practices-2.pdf

[Component] Adding YouTube Langchain Loader feature

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it adds support for YouTube video loaders in ETL.

[Component] Addition of new Vector Database

  • GenAI Stack version: 0.2.0
  • Python version: 3.8+

Hacktoberfest Accepted PR guidelines

  • Please check the documentation and CONTRIBUTING.md before you start making changes
  • We accept the PR as hacktoberfest-accepted for this issue if it adds support for any one of these vector databases:
    • Milvus
    • Qdrant
    • FAISS
  • Please comment your name along with the vector database name so the contribution can be assigned to you
