yodck / skycode-ai-codex-gpt3

This project forked from skyworkaigc/skycode-ai-codex-gpt3

SkyCode is an open-source, multi-language programming model that adopts the GPT-3 model structure. It supports Java, JavaScript, C, C++, Python, Go, shell, and other mainstream programming languages, and can understand Chinese comments. The model can complete code and solve programming problems, freeing you from routine coding so you can focus on larger problems.

Home Page: https://sky-code.singularity-ai.com/index.html#/

License: MIT License

skycode-ai-codex-gpt3's Introduction

SkyCode

SkyCode is a multi-language, open-source programming model released by Singularity-AI. It adopts the GPT-3 model structure and is trained on a large corpus of code. It supports Java, JavaScript, C, C++, Python, Go, shell, and other mainstream programming languages, and can understand Chinese comments. The model can complete code and solve programming problems, freeing you from routine coding so you can focus on larger problems.

Project Highlights

  1. Technical advantage 1: coverage of multiple programming languages

    Different programming languages target different platforms and environments, and each exists for its own reasons. SkyCode can generate code not only in widely used languages such as JavaScript, Python, Java, and C, but also in more than ten others, including PHP, Go, and Swift, so that users of different languages can all benefit from its code generation capabilities.

  2. Technical advantage 2: optimization for Chinese comments

    Large-model pre-training has long been dominated by the English-speaking community, and code generation models based on GPT-3 share this limitation. Drawing on its experience with Chinese-language models, Singularity-AI developed a Chinese encoding method tailored to the characteristics of the language and closer to Chinese usage habits, which improves the model's ability to understand Chinese comments.

  3. Technical advantage 3: strong problem-solving ability

    On the HumanEval dataset, which measures the problem-solving ability of code generation models, SkyCode scores higher than the other open-source models compared below.

    Model            pass@1    pass@10   pass@100
    GPT-Neo 1.3B     4.79%     7.47%     16.30%
    GPT-Neo 2.7B     6.41%     11.27%    21.37%
    GPT-J 6B         11.62%    15.74%    27.74%
    SkyCode 2.6B     12.84%    21.07%    35.97%

    With 2.6B parameters, SkyCode scores well above both the smaller GPT-Neo 1.3B model and the comparably sized GPT-Neo 2.7B model on all three metrics. It also outperforms GPT-J 6B, which has more than twice as many parameters. On pass@100, the metric that best reflects the upper limit of problem-solving ability, SkyCode exceeds GPT-J by 8.23 percentage points (35.97% versus 27.74%).
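    The README does not say how the pass@k figures were computed. The sketch below assumes the commonly used unbiased estimator, pass@k = 1 - C(n - c, k) / C(n, k), where n samples are drawn per problem and c of them pass the tests. The function names and the per-problem counts are hypothetical and serve only as illustration; they are not SkyCode's evaluation data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: evaluation budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every draw of k contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical (samples, passing samples) counts for three problems.
example_results = [(200, 0), (200, 3), (200, 40)]
for k in (1, 10, 100):
    print(f"pass@{k}: {mean_pass_at_k(example_results, k):.2%}")

# Gap quoted in the text: SkyCode vs. GPT-J 6B at pass@100.
print(f"gap: {35.97 - 27.74:.2f} percentage points")
```

    Because pass@k grows with k, the gap between two models at pass@100 is best read as a difference in percentage points rather than as a relative improvement.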

Installation

Recommended:
transformers>=4.18.0
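
A minimal sketch for checking the recommended version before loading the model. It uses only the standard library; the helper names (`version_tuple`, `meets_minimum`, `installed_transformers_version`) are hypothetical, and the comparison only looks at the first three numeric components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

MINIMUM_TRANSFORMERS = "4.18.0"  # taken from the recommendation above


def version_tuple(version: str) -> tuple:
    """Turn '4.18.0' (or '4.18.0.dev0') into (4, 18, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MINIMUM_TRANSFORMERS) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_transformers_version():
    """Return the installed transformers version string, or None."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


found = installed_transformers_version()
if found is None:
    print("transformers is not installed")
elif meets_minimum(found):
    print(f"transformers {found} satisfies >= {MINIMUM_TRANSFORMERS}")
else:
    print(f"transformers {found} is older than {MINIMUM_TRANSFORMERS}")
```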

Model Usage

# -*- coding: utf-8 -*-
from transformers import GPT2LMHeadModel
from transformers import AutoTokenizer
from transformers import TextGenerationPipeline

# Load the model weights and the tokenizer from the Hugging Face Hub.
model = GPT2LMHeadModel.from_pretrained("SkyWork/SkyCode")
tokenizer = AutoTokenizer.from_pretrained("SkyWork/SkyCode", trust_remote_code=True)

# device=0 runs on the first GPU; use device=-1 to run on the CPU.
text_generator = TextGenerationPipeline(model, tokenizer, device=0)

# Prompt to complete, and the maximum number of tokens to generate.
input_str = "if __name__"
max_new_tokens = 40
print(text_generator(input_str, max_new_tokens=max_new_tokens, do_sample=True))
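
The pipeline call above returns a list of dictionaries, each with a `generated_text` entry that by default includes the prompt. A sampled completion often runs on past the point of interest, so a common post-processing step is to cut it at the first stop sequence. The sketch below illustrates that step on a hand-written stand-in for the pipeline output, so it runs without downloading the model. The helper `extract_completion` and the stop sequences are assumptions chosen for Python code, not part of SkyCode.

```python
from typing import Dict, List, Sequence

# Assumed stop sequences for Python completions; adjust per language.
STOP_SEQUENCES = ("\ndef ", "\nclass ", "\nif __name__", "\nprint(")


def extract_completion(
    outputs: List[Dict[str, str]],
    prompt: str,
    stop_sequences: Sequence[str] = STOP_SEQUENCES,
) -> str:
    """Return the first completion without the prompt, cut at a stop sequence."""
    text = outputs[0]["generated_text"]
    # Drop the echoed prompt if the pipeline included it.
    completion = text[len(prompt):] if text.startswith(prompt) else text
    cut = len(completion)
    for stop in stop_sequences:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut]


# Hand-written stand-in for text_generator(...) output, for illustration only.
fake_outputs = [
    {"generated_text": 'if __name__ == "__main__":\n    main()\n\ndef unused():\n    pass\n'}
]
print(extract_completion(fake_outputs, "if __name__"))
```

With the real pipeline, pass the returned list and the original `input_str` to the helper in place of the stand-in values.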

License

MIT License

Developer Group

Scan the QR code to join the SkyCode developer group.
