
This project forked from eosphoros-ai/db-gpt


AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents

Home Page: https://docs.dbgpt.site

License: MIT License

Shell 0.32% JavaScript 0.03% Python 74.07% TypeScript 9.74% CSS 0.07% Makefile 0.18% HTML 15.44% Mako 0.02% Batchfile 0.04% Dockerfile 0.09%


DB-GPT: Revolutionizing Database Interactions with Private LLM Technology

What is DB-GPT?

🤖 DB-GPT is an open-source, AI-native data app development framework with AWEL (Agentic Workflow Expression Language) and agents.

Its purpose is to build infrastructure for large-model applications by developing several technical capabilities: multi-model management (SMMF), Text2SQL optimization, a RAG framework and its optimization, multi-agent collaboration, and AWEL (agentic workflow orchestration). Together these make it simpler and more convenient to build large-model applications on top of data.

🚀 In the Data 3.0 era, based on models and databases, enterprises and developers can build their own bespoke applications with less code.

AI-Native Data App

[Figures: Data-awels, Data-Apps, dashboard-images]


Introduction

The architecture of DB-GPT is shown in the following figure:

The core capabilities include the following parts:

  • RAG (Retrieval Augmented Generation): RAG is currently the most widely deployed and most urgently needed capability. DB-GPT already implements a RAG-based framework, so users can build knowledge-based applications on DB-GPT's RAG capabilities.

  • GBI (Generative Business Intelligence): Generative BI is one of the core capabilities of the DB-GPT project, providing the foundational data intelligence technology to build enterprise report analysis and business insights.

  • Fine-tuning Framework: Model fine-tuning is indispensable for any enterprise working in vertical and niche domains. DB-GPT provides a complete fine-tuning framework that integrates seamlessly with the rest of the project. In recent fine-tuning efforts, it achieved an accuracy of 82.5% on the Spider dataset.

  • Data-Driven Multi-Agents Framework: DB-GPT offers a data-driven, self-evolving multi-agents framework that aims to continuously make and execute decisions based on data.

  • Data Factory: The Data Factory is mainly about cleaning and processing trustworthy knowledge and data in the era of large models.

  • Data Sources: Integrating various data sources to seamlessly connect production business data to the core capabilities of DB-GPT.
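As a rough illustration of the RAG item above, the sketch below retrieves the most relevant text chunk for a question and prepends it to a prompt. It is a toy, standard-library example and not DB-GPT's implementation: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the bag-of-words similarity, and the sample knowledge base are all assumptions made for illustration. A real system would typically use embedding vectors and a vector store instead.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words term counts (hypothetical helper)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Augment the question with retrieved context before it goes to an LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Illustrative knowledge base; the contents are made up for this example.
KNOWLEDGE = [
    "Orders are stored in the orders table with a customer_id column.",
    "The office cafeteria opens at nine in the morning.",
]

prompt = build_prompt("Which table stores orders?", KNOWLEDGE)
```

The resulting `prompt` contains the retrieved chunk followed by the question; passing it to a model is left out because the model interface depends on the deployment.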

SubModule

  • DB-GPT-Hub: A high-performance Text-to-SQL workflow built by applying Supervised Fine-Tuning (SFT) to Large Language Models (LLMs).

  • dbgpts: The official repository of data apps, AWEL operators, AWEL workflow templates, and agents built on DB-GPT.

Text2SQL Finetune

  • Supported LLMs

    • LLaMA
    • LLaMA-2
    • BLOOM
    • BLOOMZ
    • Falcon
    • Baichuan
    • Baichuan2
    • InternLM
    • Qwen
    • XVERSE
    • ChatGLM2
  • SFT Accuracy: As of October 10, 2023, an open-source model with 13 billion parameters fine-tuned with this project achieved execution accuracy on the Spider dataset that surpasses even GPT-4!
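For readers unfamiliar with the metric, execution accuracy scores a predicted SQL query by running it and comparing its result with that of the reference query, rather than comparing SQL text. The sketch below shows the idea with Python's built-in `sqlite3` module. It is a simplified illustration, not the evaluation code used by DB-GPT-Hub or by the Spider benchmark: the helper names, the sample table, and the order-insensitive comparison are assumptions.

```python
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str):
    """Run sql and return its rows sorted, or None if the query fails."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn: sqlite3.Connection, pairs) -> float:
    """Fraction of (predicted, gold) pairs whose result sets match."""
    hits = 0
    for predicted, gold in pairs:
        expected = run_query(conn, gold)
        actual = run_query(conn, predicted)
        if actual is not None and actual == expected:
            hits += 1
    return hits / len(pairs) if pairs else 0.0


# Tiny in-memory database; the schema and rows are made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
""")

PAIRS = [
    # Different SQL text, same rows: counts as correct.
    ("SELECT name FROM singer WHERE age >= 30",
     "SELECT name FROM singer WHERE age > 29"),
    # Wrong filter: returns different rows.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age > 29"),
    # Invalid SQL: counts as a miss.
    ("SELEC name FROM singer", "SELECT name FROM singer"),
    # Equivalent aggregate: counts as correct.
    ("SELECT COUNT(*) FROM singer", "SELECT COUNT(id) FROM singer"),
]

score = execution_accuracy(conn, PAIRS)  # 2 of 4 pairs match
```

Comparing results instead of text means two differently written but equivalent queries both count as correct, which is why the first and last pairs score as hits.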

More Information about Text2SQL finetune

Install

Docker Linux macOS Windows

Usage Tutorial

Features

At present, we have introduced several key features to showcase our current capabilities:

  • Private Domain Q&A & Data Processing

    The DB-GPT project offers a range of functionalities designed to improve knowledge base construction and enable efficient storage and retrieval of both structured and unstructured data. These functionalities include built-in support for uploading multiple file formats, the ability to integrate custom data extraction plug-ins, and unified vector storage and retrieval capabilities for effectively managing large volumes of information.

  • Multi-Data Source & GBI (Generative Business Intelligence)

    The DB-GPT project facilitates seamless natural language interaction with diverse data sources, including Excel, databases, and data warehouses. It simplifies the process of querying and retrieving information from these sources, empowering users to engage in intuitive conversations and gain insights. Moreover, DB-GPT supports the generation of analytical reports, providing users with valuable data summaries and interpretations.

  • Multi-Agents & Plugins

    DB-GPT supports custom plug-ins for performing various tasks and natively integrates the Auto-GPT plug-in model. Its agent protocol follows the Agent Protocol standard.

  • Automated Text2SQL Fine-tuning

    We've also developed a lightweight automated fine-tuning framework centred on large language models (LLMs), Text2SQL datasets, and fine-tuning methods such as LoRA, QLoRA, and P-tuning. This framework makes Text-to-SQL fine-tuning as straightforward as an assembly line. See DB-GPT-Hub.

  • SMMF (Service-oriented Multi-model Management Framework)

    We offer extensive model support, including dozens of large language models (LLMs) from both open-source and API agents, such as LLaMA/LLaMA2, Baichuan, ChatGLM, Wenxin, Tongyi, Zhipu, and many more.

  • Privacy and Security

    We ensure the privacy and security of data through the implementation of various technologies, including privatized large models and proxy desensitization.

  • Supported Datasources

Image

🌐 AutoDL Image

Language Switching

In the .env configuration file, modify the LANGUAGE parameter to switch to different languages. The default is English (Chinese: zh, English: en, other languages to be added later).
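To show what this setting amounts to, the sketch below parses a `.env`-style fragment and resolves the `LANGUAGE` value. It is an illustration only and not DB-GPT's own configuration loader: `read_env`, `resolve_language`, and the sample contents are hypothetical, and falling back to English for unrecognized codes is an assumption rather than documented behavior.

```python
def read_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# The two language codes named in the text above.
SUPPORTED = {"en", "zh"}


def resolve_language(env: dict[str, str]) -> str:
    """Return the configured language, defaulting to English.

    Falling back to "en" for unknown codes is an assumption of this sketch.
    """
    lang = env.get("LANGUAGE", "en").lower()
    return lang if lang in SUPPORTED else "en"


# Example .env fragment that switches the language to Chinese.
SAMPLE_ENV = """
# Language setting
LANGUAGE=zh
"""

language = resolve_language(read_env(SAMPLE_ENV))
```

Here `language` resolves to `zh`; removing the `LANGUAGE` line from the fragment yields the English default.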

Contribution

Contributors Wall

Licence

The MIT License (MIT)

Citation

If you find DB-GPT useful for your research or development, please cite the following paper:

@article{xue2023dbgpt,
      title={DB-GPT: Empowering Database Interactions with Private Large Language Models}, 
      author={Siqiao Xue and Caigao Jiang and Wenhui Shi and Fangyin Cheng and Keting Chen and Hongjun Yang and Zhiping Zhang and Jianshan He and Hongyang Zhang and Ganglin Wei and Wang Zhao and Fan Zhou and Danrui Qi and Hong Yi and Shaodong Liu and Faqiang Chen},
      year={2023},
      journal={arXiv preprint arXiv:2312.17449},
      url={https://arxiv.org/abs/2312.17449}
}

Contact Information

We are working on building a community. If you have any ideas for it, feel free to contact us.
