whitebox-code-gpt's Introduction

Welcome to Whitebox

🎒 Our inventory assistant will deliver a link to the best programming assistant for your use case.

Our goal is to accelerate the development of free, high-quality AI assistants built with GPT builder by allowing experts and users to collaborate openly. Here you'll find instructions and knowledge files for creating next-gen programming assistants.

All ideas are welcome. If you would like to add a new assistant, fork this repository and add your files, then issue a pull request. Also remember to update the index in README.md.
If you would rather maintain the assistant alone, you may issue a pull request adding your link to the partnered index.

If you are experiencing a problem with one of our assistants, please open an issue and include the title of the assistant and links to the relevant conversation history. If the conversation contains sensitive information, you may paste a generalized plain-text version instead.

Twitter | Threads | Discord (new!)

Existing models:

All assistants are hosted on ChatGPT and are 100% free to use for ChatGPT premium users. Assistants are held to the highest standards and are quality-tested to guarantee a great user experience.


Application-specific:

  • Bioinformatics: Coming soon
  • Controls & Automation Engineering:

Dorkotron for finding everything else.

Partnered models:

How does it work?

  1. What are Custom GPTs?

    • Custom GPTs allow experts to collaborate and condense their knowledge into a single assistant powered by GPT-4. You can read OpenAI's announcement here.
    • Because they're hosted on ChatGPT, all Code-GPT assistants can be used freely and require no installation.
      If a user does not have ChatGPT premium, assistants may still be used by copying knowledge files to a different LLM.
  2. Background

    • AI assistants make programmers more effective by suggesting improvements and providing context based on a wide training set of language and code.
    • A key flaw is that they cannot stay continuously up to date on best practices for every domain. Because of this, all models have blind spots that limit their potential. To counteract this, we must identify the blind spots caused by training and create techniques to overcome them.
    • By open-sourcing documents, experts may collaborate, discuss, and fork assistants to create effective assistants for every use case.
  3. Purpose and Function

    • Expanded context: The latest generation of multimodal LLMs can parse massive files that would typically overwhelm their context windows. If information is structured correctly, this can vastly increase the amount of knowledge available to a model when working in a known field. For instance, we created specific rule sets for each flavor of regex, which greatly improved our assistant's ability to create valid patterns that did not mix flavors.
    • Specialization: Each knowledge file is dedicated to a particular entity or topic, providing in-depth information about it. This could include historical data, technical specifications, or any relevant details that aid the assistant's understanding of a topic.
    • Integration with GPT: These files are designed to be integrated into the LLM's existing knowledge base, augmenting its ability to generate accurate and contextually relevant responses about the specific entities.
    • Content Organization: Information within these files is usually organized in a hierarchical or relational manner, allowing the model to understand the connections between different pieces of data.

  4. Creation and Maintenance

    • Data Sourcing: The information in these files is compiled from reliable sources, ensuring accuracy and relevancy. Experts for given frameworks are welcome to contribute files or improvements.
    • Regular Updates: To maintain the relevance of the information, these knowledge files are regularly updated with the latest data.
    • Quality Assurance: Assistants are checked rigorously to ensure the accuracy of the information. A secondary goal of this project is to develop automated testing so that functionality can be guaranteed across all models.

  5. Impact on GPT Performance

    • Enhanced Accuracy: By having direct access to detailed information, the GPT model can provide better and more accurate responses.
    • Efficiency: Since the data is structured and tailored for quick retrieval, the response time can be faster for queries related to these entities.
    • Customization: This approach allows for customization of the GPT model’s responses based on the specific requirements of the application or domain.
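
The regex rule sets mentioned under Purpose and Function could be encoded in many ways. The following is a minimal sketch, not the format Whitebox actually uses: `FLAVOR_RULES`, `find_violations`, and `compiles_in_python` are hypothetical names, and the detectors are naive text heuristics that ignore escaping and character classes. It flags constructs that a target flavor rejects, such as lookarounds and backreferences in Go's RE2-based `regexp` package, or JavaScript-style named groups and `\p{...}` escapes in Python's `re` module.

```python
import re

# Hypothetical rule set: for each flavor, constructs that the flavor rejects,
# paired with a naive detector (itself a Python regex) for spotting them.
FLAVOR_RULES = {
    "go": {  # Go's regexp package follows RE2 syntax
        "lookahead": r"\(\?[=!]",
        "lookbehind": r"\(\?<[=!]",
        "backreference": r"\\[1-9]",
    },
    "python": {  # Python's built-in re module
        "js_named_group": r"\(\?<[A-Za-z_]",  # Python expects (?P<name>...)
        "unicode_property": r"\\[pP]\{",      # \p{...} is not supported by re
    },
}


def find_violations(pattern: str, flavor: str) -> list[str]:
    """Return the names of rules that the pattern appears to break for a flavor."""
    rules = FLAVOR_RULES[flavor]
    return [name for name, detector in rules.items() if re.search(detector, pattern)]


def compiles_in_python(pattern: str) -> bool:
    """Cross-check a pattern against the real Python engine."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False
```

For example, `find_violations(r"(?<=\$)\d+", "go")` reports the lookbehind, while the same pattern passes the Python rules and compiles with `re`. A check like this could also serve as a starting point for the automated testing described under Quality Assurance.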

Custodial process:

Since each assistant must be associated with a single OpenAI account, we will assign a custodian to manage its state. Custodians are subject matter experts for their given technology and are the sole deciders of what content is included in the official model.

Custodian: If you are interested in becoming a custodian, create a fork and add a new folder. Once the new assistant is created, issue a pull request to have it added.

Admin: The admin will assess possible candidates and grant custodianship to the most qualified candidate. The admin is the sole decider of who is the official custodian of an assistant but should seek out the opinions of the community before adding or revoking custodianship.

Admin: Once the assistant is complete and a link is provided, the admin will confirm the directory in this file is updated and then merge the pull request.

Revoking custodianship: If a custodian wishes to forfeit custodianship of an assistant, we ask that they participate in finding a suitable replacement. Once found, we will grant them access and update the directory to reflect the change of ownership.

Making and maintaining assistants:

Activity: Once custodianship is granted, you're free to update your assistant however you see fit. We just ask that you make a reasonable effort to seek and aggregate user requests and improve your assistant, especially during periods of high activity such as when OpenAI updates their models, or a new major revision of a language is released.

Standards: The custodian has the final say in the name and description of an assistant, but we ask that both be descriptive and that the description feature a link to this repo. For instance: "Python development made easy. Maintained by Whitebox at https://github.com/Decron/Whitebox"

Experimentation: It may be beneficial to create a backup assistant to experiment with to avoid disrupting users of the primary assistant.

Conversation training: For now we ask that you disable conversation training for the models under your purview. There are pros and cons to leaving it disabled, and the topic can be revisited later if the community believes conversation training is important.

Less is more: If your assistant is struggling with too many files or over-generalization, you can always split it into multiple assistants.

Are Whitebox assistants safe for enterprises?

For the most part, yes. Here are the facts:

  • This project is entirely open-source, so you may repurpose this repo however you see fit. In return, giving credit for our files is appreciated, but the decision is ultimately yours.
  • We've asked all custodians to disable conversation training. Because this setting cannot be independently verified, we recommend not including information you would not want OpenAI to see. Whitebox does not have access to your conversation history.
  • Training based on knowledge files and uploaded documents cannot be disabled with GPT builder. Because of this you should not include sensitive material in knowledge files for our assistants, and you should not upload sensitive files when using them.
  • Unconsented storage of user data by model creators is absolutely prohibited and will lead to irrevocable dismissal from the project.
  • If you have a custom OpenAI endpoint or you are using our knowledge files on a different LLM, rules about conversation and document training may not apply. Talk to your system administrator.
  • If you would like our assistance creating personalized assistants for your enterprise, please message us at [email protected].

Getting involved:

Contributing

  • The most important thing is to understand GPT-4's weaknesses and blind spots. If you find it struggling with certain topics or see complaints online, open an issue or a discussion to help us understand the problem.
  • Secondly, we need to get the word out about this new technology. Share this repo with people you think would be interested, and invite domain experts to contribute by claiming assistants.
  • If you're reading this we want to hear your use case. What annoys you most about programming assistants? Go open a discussion and we'll do our best to improve your experience.
  • If you don't have access to ChatGPT premium, we'd love to collaborate on other applications for our knowledge files.
  • If you'd like to hear announcements about new assistant releases and partnered agents, follow us for free on Substack.

Support

  • Whitebox is maintained entirely by volunteers. If you would like to donate to the project, see our Donation Link
  • If you're interested in Whitebox swag, we have a merch page here

"I don't like reading. Is there a GPT that will spoon-feed this to me?"

Yes: https://chat.openai.com/g/g-cwigWCh11-code-gpt-gpt

Sponsors

We are actively seeking organizations to sponsor this project so we may deliver the best possible programming assistants. If you're interested in sponsoring us please send all inquiries to [email protected]

This project was brought to you by The Hadrio Group. We are a San Francisco-based community of MIT and UC Berkeley alumni that focuses on quality and data stewardship in AI.

Additional models

This project is geared to optimize assistants for the custom GPT marketplace provided by OpenAI. If you find that our knowledge files transfer effectively to other models, we would be very interested in hearing more about it.
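
One way to try the knowledge files on another model is to concatenate them into a single system prompt. The sketch below is only an illustration under stated assumptions: `build_system_prompt` is a hypothetical helper, the file names and contents are made up, and `max_chars` is a rough stand-in for whatever context limit the target model has.

```python
def build_system_prompt(instructions: str, knowledge: dict[str, str], max_chars: int = 20000) -> str:
    """Join instructions and knowledge files into one prompt.

    Files are added in order; any file that would push the prompt past
    max_chars is skipped, so the result stays within the budget as long as
    the instructions alone fit.
    """
    separator = "\n\n"
    parts = [instructions.strip()]
    used = len(parts[0])
    for name, text in knowledge.items():
        # Label each file so the model can tell the sources apart.
        section = f"## Knowledge file: {name}\n{text.strip()}"
        if used + len(separator) + len(section) > max_chars:
            continue  # this file does not fit in the remaining budget
        parts.append(section)
        used += len(separator) + len(section)
    return separator.join(parts)
```

Labeling each section with its file name is a design choice rather than a requirement. It may help a model pick the right rules when a user names a specific language or flavor.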





Wander with confidence.

whitebox-code-gpt's People

Contributors

3jame, decron, eltociear, thehadriogroup

whitebox-code-gpt's Issues

regex action not up to date with marketplace

Hi there, I noticed that the regex GPT is not up to date with the conversation starters on the ChatGPT website. I was also wondering how the GPT knows which file to use for which language. Is ChatGPT smart enough to read the filenames and know that if the user says they are using Go, it should reference the Go file?

Conversation starters in the "Regex Assistant by Whitebox" regex GPT:

What styles can you interpret?
Explain this regex pattern: \d{2,4}
Create a positive lookbehind
Create a regex to find email addresses

C#: .NET Core

A user on Reddit mentioned that GPT lacks a good understanding of .NET Core. Let's dig into what exactly it struggles with and look into possible solutions.

For now this might take the form of just uploading a public domain .NET Core PDF, but as a long-term goal we should work to condense the guide into a more hierarchical form to allow better parsing.

New GPT request: Beginner AI Content Creator's Assistant

Description:

I propose the creation of a specialized GPT, named "Content Creator's Assistant," aimed at supporting the generation of content focused on a broad range of AI-powered solutions, catering especially to beginners. This GPT will encompass a wide array of topics, not limited to but including no-code builders, chatbots, automation systems, and various AI-driven tools.

Expanded Focus Topics:

  • No-Code Builders: Covering platforms that allow the creation of applications and websites without coding knowledge.
  • Chatbots and Messaging Platforms: Including ManyChat, as well as other popular platforms like Chatfuel, Drift, and Intercom.
  • Automation Tools: Encompassing a range of tools such as Make (formerly Integromat), Zapier, and Bardeen, and exploring their use in automating tasks and workflows.
  • AI-Powered Marketing Tools: Discussing AI-driven solutions for marketing, such as AI content generators, customer data analysis tools, and personalized marketing automation.
  • Machine Learning Interfaces: Tools that simplify machine learning for non-experts, like Google AutoML, IBM Watson, and Amazon SageMaker.

Target Audience:

This GPT will primarily benefit beginners in AI and technology enthusiasts, as well as content creators, bloggers, and digital marketers looking to demystify AI technologies and automation tools for a general audience.

Functionality:

  • Idea Generation: Generating creative and informative blog post ideas across a diverse range of AI and automation topics.
  • Content Outlining: Providing structured outlines suitable for both short-form and long-form content, guiding writers through the process of creating coherent and engaging articles.
  • Accessibility: Ensuring content is beginner-friendly, breaking down complex concepts into easily digestible information.

Purpose:

The aim of this GPT is to facilitate content creation around a variety of AI-powered solutions, making it simpler for writers to produce educational and engaging content that appeals to AI novices and enthusiasts alike. This tool will serve as a valuable resource in fostering a deeper understanding and interest in AI and automation technologies among a broader audience.
