opencsgs / csghub-server

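CSGHub Server is the backend server for CSGHub, which helps users manage datasets, model files, code, and more. It is the open-source server-side component of CSGHub, an open-source platform for managing large model assets, and provides REST API based management of models, datasets, and other large model assets. Feedback and stars ⭐️ are welcome.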

Home Page: https://opencsg.com/models

License: Apache License 2.0

Languages: Go 95.60%, Python 3.15%, Shell 0.81%, PLpgSQL 0.39%, Dockerfile 0.05%

Topics: ai, huggingface, llm, platform, datasets, golang, models

csghub-server's Introduction

CSGHub Server is part of CSGHub, an open-source, reliable platform for managing large model assets. It focuses on managing models, datasets, and other LLM assets through a REST API.

Key Features:

  • Creation and management of users and organizations
  • Auto-tagging of model and dataset labels
  • Search for users, organizations, models, and data
  • Online preview of dataset files, such as .parquet files
  • Content moderation for both text and image
  • Download of individual files, including LFS files
  • Tracking of model and dataset activity data, such as download and like counts

Demo

To help users quickly understand the features and usage of CSGHub, we have recorded a demo video that walks through the main features and operating procedures.

  • The CSGHub demo video is below; you can also watch it on YouTube or Bilibili
    csghub-demo-1080p.mp4

Please visit the OpenCSG website to experience the powerful management features.

Quick Start

System resource requirements: 4 CPU cores / 8 GB memory

Install Docker before you begin. This project has been tested on Ubuntu 22.

You can quickly deploy a local CSGHub Server instance with docker-compose:

# The API token should be at least 128 characters long, and HTTP requests to csghub-server require the API token to be sent as a Bearer token for authentication.
export STARHUB_SERVER_API_TOKEN=<API token>
mkdir -m 777 gitea minio_data
curl -L https://raw.githubusercontent.com/OpenCSGs/csghub-server/main/docker-compose.yml -o docker-compose.yml
docker-compose -f docker-compose.yml up -d
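
The comment above states two requirements: the token must be at least 128 characters long, and clients must send it as a Bearer token. The following is a minimal sketch of one way to satisfy both, not the project's own tooling. The function names `generate_api_token` and `bearer_headers` are hypothetical, and the choice of a random hex string is an assumption; the source only specifies the minimum length.

```python
import secrets

# Minimum length stated in the Quick Start comment.
MIN_TOKEN_LENGTH = 128


def generate_api_token(num_bytes: int = 64) -> str:
    """Return a random hex token; each random byte encodes to two hex characters."""
    token = secrets.token_hex(num_bytes)
    if len(token) < MIN_TOKEN_LENGTH:
        raise ValueError(
            f"token has {len(token)} characters, need at least {MIN_TOKEN_LENGTH}"
        )
    return token


def bearer_headers(token: str) -> dict[str, str]:
    """Build the HTTP Authorization header that carries the token as a Bearer token."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Print a fresh token so it can be assigned to STARHUB_SERVER_API_TOKEN.
    print(generate_api_token())
```

The printed value can be exported as STARHUB_SERVER_API_TOKEN before running docker-compose, and the dictionary from `bearer_headers` can be passed as request headers by any HTTP client.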

Technical Architecture

csghub-server architecture

Extensible and customizable

  • Supports different Git servers, such as Gitea and GitLab.
  • Supports flexible configuration of the LFS storage system: you can use local storage or any third-party cloud storage service that is compatible with the S3 protocol.
  • Content moderation can be enabled on demand, using any third-party content moderation service.

Roadmap

  • Support more Git servers: Currently supports Gitea, with support for mainstream Git repositories planned for the future.
  • Git LFS: Supports large files through Git LFS, with Git command operations and online download through the Web UI.
  • Dataset online viewer: Dataset preview, with Top20/TopN loading preview of LFS-format datasets.
  • Model/Dataset AutoTag: Supports custom metadata and automatic extraction of model/dataset tags.
  • S3 protocol support: Supports the S3 (MinIO) storage protocol, providing higher reliability and storage cost-effectiveness.
  • Model format conversion: Conversion of mainstream model formats.
  • Model one-click deployment: Supports integration with OpenCSG llm-inference to start model inference with one click.

License

This project is licensed under the Apache 2.0 license; see the LICENSE file for details.

Contributing

If you wish to contribute, please follow the Contribution Guidelines. We are very excited about your contributions!

Acknowledgments

This project is based on open source projects such as Gin, DuckDB, MinIO, and Gitea. We would like to express our sincere gratitude to them for their open source contributions!

Contact Us

If you run into any problems, you can contact us in any of the following ways:

  1. Open an issue on GitHub
  2. Join our WeChat group by scanning the WeChat helper QR code
  3. Join our official Discord channel: OpenCSG Discord Channel
  4. Join our Slack workspace: OpenCSG Slack Channel

csghub-server's People

Contributors

ganisback, kinglywayne, lwf0019, pulltheflower, rader, seanhh86, wanghh2000, wayneliu0019, zhanglongchun

csghub-server's Issues

Support for Additional Git Servers

The current implementation of CSGHub Server only supports Gitea as the Git server. As mentioned in the roadmap, it would be beneficial to support more mainstream Git repositories in the future. I propose adding support for GitLab and GitHub as additional Git servers.

Motivation

This feature would allow users to integrate CSGHub Server with their existing Git workflows, making it more versatile and appealing to a wider range of users.

Expected Behavior

The CSGHub Server should be able to connect to and manage repositories on GitLab and GitHub, in addition to Gitea.

Acceptance Criteria

  • The CSGHub Server can successfully connect to GitLab and GitHub repositories.
  • The CSGHub Server can manage repositories on GitLab and GitHub, including creating, updating, and deleting repositories.
  • The CSGHub Server can handle authentication and authorization for GitLab and GitHub repositories.

Server image fails to start: cannot load the DuckDB extension httpfs

Running the server image in an offline environment produces the following error:
init logger, level: INFO, format: json
all_in_one-csghub_server-1 | {"time":"2024-04-27T06:49:16.456842199Z","level":"INFO","msg":"FIFOScheduler run started"}
all_in_one-csghub_server-1 | {"time":"2024-04-27T06:49:16.457924209Z","level":"ERROR","msg":"refresh status all failed","error":"Get \"http://localhost:8082/status-all\": dial tcp [::1]:8082: connect: connection refused"}
all_in_one-csghub_server-1 | Error: failed to init router: error creating dataset viewer handler:failed to create parquet reader,cause:failed to setup s3 for duckdb, cause:IO Error: Failed to download extension "httpfs" at URL "http://extensions.duckdb.org/v0.9.2/linux_amd64/httpfs.duckdb_extension.gz"
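
The last log line shows the exact URL that DuckDB tried to download the extension from. In an offline deployment, one possible workaround (an assumption, not something the project confirms) is to fetch that file on a connected machine and make it available to the container. The sketch below only reconstructs the download URL from its parts so the right file can be identified; the function name `duckdb_extension_url` is hypothetical, and the URL layout is inferred solely from the single URL in the log.

```python
def duckdb_extension_url(
    name: str,
    version: str,
    platform: str,
    base: str = "http://extensions.duckdb.org",
) -> str:
    """Rebuild an extension download URL using the layout seen in the error log.

    The pattern <base>/<version>/<platform>/<name>.duckdb_extension.gz is taken
    from the one URL in the log above; other versions or platforms may differ.
    """
    return f"{base}/{version}/{platform}/{name}.duckdb_extension.gz"


if __name__ == "__main__":
    # Values copied from the failing request in the log.
    print(duckdb_extension_url("httpfs", "v0.9.2", "linux_amd64"))
```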

Cannot pass the endpoint parameter via the CLI

  1. endpoint cannot be passed as a CLI parameter
  2. file_download fails for the dataset repo type
Traceback (most recent call last):
  File "/Users/hhwang/code/jihulab/opencsg/csghub-sdk/test.py", line 21, in <module>
    result = snapshot_download(repo_id=repoid, cache_dir=cachedir, endpoint=ep, token=my_token, repo_type=repotype)
  File "/Users/hhwang/code/jihulab/opencsg/csghub-sdk/pycsghub/snapshot_download.py", line 61, in snapshot_download
    repo_info = utils.get_repo_info(repo_id,
  File "/Users/hhwang/code/jihulab/opencsg/csghub-sdk/pycsghub/utils.py", line 149, in get_repo_info
    return method(
  File "/Users/hhwang/code/jihulab/opencsg/csghub-sdk/pycsghub/utils.py", line 355, in model_info
    r.raise_for_status()
  File "/Users/hhwang/anaconda3/envs/hubsdk/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://hub-stg.opencsg.com/hf/api/models/wanghh2003/myds1/revision/main
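
The traceback suggests that the request went through `model_info` to a `/hf/api/models/...` URL even though the issue concerns the dataset repo type. The following is a minimal sketch of how a URL could be chosen by repo type; it is not the pycsghub implementation. The names `REPO_TYPE_SEGMENTS` and `repo_info_url` are hypothetical, and the `datasets` path segment is an assumption modelled on the `models` segment, since only the latter appears in the traceback.

```python
# Hypothetical mapping from repo_type to URL path segment.
# Only "models" appears in the traceback; "datasets" is an assumption.
REPO_TYPE_SEGMENTS = {"model": "models", "dataset": "datasets"}


def repo_info_url(
    endpoint: str,
    repo_id: str,
    revision: str = "main",
    repo_type: str = "model",
) -> str:
    """Build a repo-info URL whose path segment depends on repo_type."""
    try:
        segment = REPO_TYPE_SEGMENTS[repo_type]
    except KeyError:
        raise ValueError(f"unsupported repo_type: {repo_type!r}") from None
    return f"{endpoint.rstrip('/')}/hf/api/{segment}/{repo_id}/revision/{revision}"


if __name__ == "__main__":
    # With repo_type="model" this reproduces the URL from the traceback.
    print(repo_info_url("https://hub-stg.opencsg.com", "wanghh2003/myds1"))
    # With repo_type="dataset" the (assumed) datasets segment is used instead.
    print(repo_info_url("https://hub-stg.opencsg.com", "wanghh2003/myds1", repo_type="dataset"))
```

Raising on an unknown repo type, rather than silently falling back to the model path, makes this kind of mismatch visible at the call site.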

Service cannot restart

After the host is shut down, the service cannot restart and stays stuck at:
[root@localhost all_in_one]# docker logs 166aee40504a
Waiting for Gitea service to be ready...

Startup succeeds only after deleting gitdata and gitlogs in the startup directory and starting again, but doing so loses data.
