
autogpt's Introduction

AutoGPT

AutoGPT is a library to automate interactions with an LLM. It is built on top of OpenAI's ChatGPT and Dask.

Features

See roadmap

  • Multi-process, multi-thread, distributed using dask
  • Sessions and interactions recorded in Notion
  • Reads tasks from Notion
  • Background mode which polls Notion for new tasks
  • Budget management
  • Tasks
    • Simple LLM queries
    • Summarize a text
    • Query multiple personas
    • Summarize multiple personas

Getting started

  • Install python 3.8+
  • Install poetry
  • Install docker (if using docker as a backend)
  • Install kubernetes (if using kubernetes as a backend)
  • Clone this repository
  • poetry install

Run without scheduler (local)

  • Make sure the SCHEDULER_URL environment variable is not set, either in your shell environment or in your .env file
  • Run the CLI: python cli.py
  • You can visit the dask dashboard at http://localhost:8787/

Run with scheduler (distributed)

  • In your .env file, set SCHEDULER_URL=tcp://localhost:8786 (a quick connectivity check is sketched after this list)
  • Start one dask scheduler and one or more workers (in the poetry environment)
    • dask scheduler
    • dask-worker --nprocs 1 --nthreads 1 --memory-limit 2GB tcp://localhost:8786
      • Feel free to configure your dask workers as you see fit and to launch as many as you want
  • Run the CLI: python cli.py
  • You can visit the dask dashboard at http://localhost:8787/
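
If you want to confirm the scheduler is reachable before launching the CLI, a minimal check using dask's own client looks like the sketch below. This is only a sanity check run by hand; it assumes the scheduler address shown above and is independent of how AutoGPT itself connects.

    from dask.distributed import Client

    # Connect to the same address as SCHEDULER_URL and list the workers
    # currently registered with the scheduler.
    client = Client("tcp://localhost:8786")
    print(list(client.scheduler_info()["workers"]))
    client.close()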

Notion

You can store sessions and interactions in Notion for reviewing/analysis purposes. To do so, you need to set the following environment variables in your .env file:

NOTION_TOKEN
NOTION_TASK_DATABASE_ID
NOTION_SESSION_DATABASE_ID
NOTION_INTERACTION_DATABASE_ID

For AutoGPT to interact with your Notion workspace, go to Settings & members, then Connections, then Develop or manage integrations. Click on New integration, fill out the form and associate it with the workspace you want to use. Once the integration is created, copy the generated token as NOTION_TOKEN.

You then need to create 3 new pages. With the pages created, click on the three dots in the top-right corner and select Add connections, then pick the integration you just created. Repeat this process for every page you want to give the integration access to.

For the task database, you need to create a database with the following properties:

  • Query: the title property
  • Status: Select, with the following options: Not started, In progress, Done
  • Budget: Budget for the task
  • Created: Creation date/time of the task, used to process tasks in ascending order (see the query sketch after this list)
  • Updated (optional)
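
As a rough illustration of how this schema can be consumed, here is a sketch using the notion-client package that fetches tasks whose Status is Not started, oldest first. It is not necessarily what AutoGPT does internally.

    import os

    from notion_client import Client

    notion = Client(auth=os.environ["NOTION_TOKEN"])
    # Fetch "Not started" tasks, oldest first, using the Status and
    # Created properties described above.
    results = notion.databases.query(
        database_id=os.environ["NOTION_TASK_DATABASE_ID"],
        filter={"property": "Status", "select": {"equals": "Not started"}},
        sorts=[{"property": "Created", "direction": "ascending"}],
    )
    for page in results["results"]:
        query = page["properties"]["Query"]["title"][0]["plain_text"]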

For the session database, you need to create a database with the following properties:

  • Id: the title property
  • Interactions: Relation to the interactions database
  • Budget: Budget for the session
  • Total cost (optional): Rollup, using the Cost property of the Interactions relation, calculate the sum
  • Duration (optional): Rollup, using the Duration property of the Interactions relation, calculate the date range
  • Total interactions (optional): Rollup, using the Interactions property of the Interactions relation, calculate the count
  • Created (optional)
  • Updated (optional)

For the interactions database, you need to create a database with the following properties:

  • Prompt
  • Response
  • Task
  • Cost
  • Parent (relation to the interactions database, parent interaction)
  • Children (relation to the interactions database, children interactions)
  • Sessions (inverse relation to the sessions database)
  • Created (optional)
  • Updated (optional)
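
To give an idea of how a row in this database maps onto the Notion API, here is a hypothetical notion-client call. It assumes Prompt is the title property and Response/Task are rich text; adjust it to your actual schema, and note that the payload AutoGPT writes may differ.

    # session_page_id is a placeholder for the page id of an existing
    # session row; the property names follow the schema above.
    notion.pages.create(
        parent={"database_id": os.environ["NOTION_INTERACTION_DATABASE_ID"]},
        properties={
            "Prompt": {"title": [{"text": {"content": "What is AutoGPT?"}}]},
            "Response": {"rich_text": [{"text": {"content": "..."}}]},
            "Task": {"rich_text": [{"text": {"content": "Simple"}}]},
            "Cost": {"number": 0.002},
            "Sessions": {"relation": [{"id": session_page_id}]},
        },
    )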

autogpt's People

Contributors

tomzx


autogpt's Issues

Submit queries to specific agents

When processing the response of an LLM during a task, it should be possible to return, as the task response, a list of next queries along with which agent should execute each query. This would, for example, allow running multiple personas/roles as agents and having them process requests.

Task dashboard

Allow the operator/user to visualize the current stack of tasks being worked on by the agent.

Process query from stdin

Allow calling echo "Some query" | python cli.py, which should behave just like python cli.py "Some query".
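
A minimal sketch of how cli.py could support this; the query variable stands in for whatever the argument parser produced, and the names are illustrative only.

    import sys

    # Fall back to stdin when no query argument was given and input is piped in.
    if query is None and not sys.stdin.isatty():
        query = sys.stdin.read().strip()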

Add pre/post prompt arguments

Allow users to insert additional prompts before and after the query generated by a given task.

For example
post_prompt=Be concise.

could be added to a query-multiple-personas task.
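
One way this could be wired in, as a sketch; pre_prompt, post_prompt and task_prompt are illustrative names, not existing options.

    # Compose the final prompt, skipping any part that was not provided.
    parts = [pre_prompt, task_prompt, post_prompt]
    prompt = "\n".join(part for part in parts if part)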

Distributed budget

Share a global budget among many agents.

An initial implementation could be to give a part of the current agent's remaining budget to the new agent. Once the agent terminates, it could report how much budget it actually used. Alternatively, we could record how much budget was spent and, at regular intervals, adjust the remaining budget.

A more "elegant" solution would be to use a shared database or memory, possibly using dask Lock.

Process requests from notion

Have a table with the original query, the task, and the current state.
Have a loop that queries Notion for any entry that isn't started yet and processes it.
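
A sketch of what that loop could look like, reusing the query example from the Notion section above; fetch_not_started_tasks and process_task are placeholder helpers, not existing functions.

    import time

    while True:
        for page in fetch_not_started_tasks():  # placeholder helper
            # Mark the entry as picked up before processing it.
            notion.pages.update(
                page_id=page["id"],
                properties={"Status": {"select": {"name": "In progress"}}},
            )
            process_task(page)  # placeholder helper
        time.sleep(60)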

Programmer prompt

Design a prompt used to generate Python code that can solve the specified goal.
The Python code can start as a skeleton and become more detailed over time.

Start/Stop/Continue a project

In the QueryMultiplePersonas task, it should be possible to call the task many times without requiring any call to the LLM if all the personas have already been resolved.

Personas selector

Select which personas to query when using QueryMultiplePersonas or SummarizeMultiplePersonas using different methods, such as querying the LLM for its opinion, picking randomly, etc.

HTTP server

Run AutoGPT as an HTTP server that can process queries in the same format as the CLI (i.e. task + request/query).
Return the response and all associated information (e.g., consumed tokens, cost) as JSON.

Ideally the HTTP server framework should support HTTP/2 to reduce the need to reconnect between client and server.
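
As one possible shape (not a committed design), here is a FastAPI sketch, which could be served by an ASGI server that supports HTTP/2, such as Hypercorn. run_task is a placeholder for however AutoGPT executes a task.

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class QueryRequest(BaseModel):
        task: str
        query: str

    @app.post("/query")
    def handle_query(body: QueryRequest):
        # run_task is a placeholder; it would return the LLM response plus metadata.
        response, consumed_tokens, cost = run_task(body.task, body.query)
        return {"response": response, "consumed_tokens": consumed_tokens, "cost": cost}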

Queue tasks

When launching the CLI, it should be possible to specify a sequence of tasks to execute sequentially (maybe a graph of tasks?).

For instance, we currently have SummarizeMultiplePersonas as a task, which is in itself a collection of calls to Simple, followed by a SummarizeResponses.

In #36 we want to implement a personas selector. The idea would be to run SelectPersonas, then QueryMultiplePersonas with the returned list, and finally SummarizeResponses.

We should be able to get rid of SummarizeMultiplePersonas and replace it with QueryMultiplePersonas,SummarizeResponses.

The goal is to increase reusability while reducing the need for tasks that are composites of other tasks.
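
Sketched with a hypothetical run_task helper, the sequence from #36 would then read:

    # Each step feeds its output into the next; run_task is illustrative only.
    personas = run_task("SelectPersonas", query)
    responses = run_task("QueryMultiplePersonas", query, personas=personas)
    summary = run_task("SummarizeResponses", responses)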

Web UI

Find a ChatGPT web UI on GitHub and adapt it to call the AutoGPT HTTP backend.

Support additional LLM backends

This might be achievable by using a library that already offers support for multiple backends, e.g., LangChain.

  • Claude (Anthropic)
  • Coral (Cohere)
  • Bard (Google)
  • Replicate

Time budget

We currently support a monetary budget to determine how long the agent should run.
We should also be able to specify a time constraint, after which the agent will stop.
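
A sketch of how such a check could sit next to the existing budget check; time_budget, budget_remaining and step are illustrative names.

    import time

    deadline = time.monotonic() + time_budget
    while budget_remaining > 0 and time.monotonic() < deadline:
        step()  # one unit of agent work (placeholder)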

Support for complex graph queries

Some complex queries could be submitted as a graph: an initial query is submitted, which can result in many agents being spawned, after which we would like to take the generated outputs and feed them back into a single query (e.g., submit a single query to multiple roles, then take their individual feedback and summarize it).

Dialogue between personas

Make it possible for multiple personas to have a discussion between each other. The number of personas can be unlimited.

Ideally we would spin up one agent per persona and have the messages go back and forth between the agents.
All the conversations should be part of a unique session for each agent.
If an agent's budget is consumed, it will stop participating in the discussion.
